[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8143:
- Attachment: 8143v2.094.txt

Patch for 0.94.

HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
--
Key: HBASE-8143
URL: https://issues.apache.org/jira/browse/HBASE-8143
Project: HBase
Issue Type: Bug
Components: hadoop2
Affects Versions: 0.98.0, 0.94.7, 0.95.0
Reporter: Enis Soztutar
Assignee: stack
Priority: Critical
Fix For: 0.98.0, 0.96.1
Attachments: 8143.hbase-default.xml.txt, 8143doc.txt, 8143v2.094.txt, 8143v2.txt, OpenFileTest.java

We've run into an issue with HBase 0.94 on Hadoop 2 with SSR turned on: the memory usage of the HBase process grows to 7g on an -Xmx3g heap, and after some time this causes OOM for the RSs. Upon further investigation, I've found that we end up with 200 regions, each having 3-4 store files open. Under Hadoop 2 SSR, BlockReaderLocal allocates DirectBuffers, unlike HDFS 1 where there is no direct buffer allocation. It seems that there are no guards against the memory used by local buffers in HDFS 2, and having a large number of open files causes multiple GB of memory to be consumed by the RS process.

This issue is to investigate further what is going on: whether we can limit the memory usage in HDFS or HBase, and/or document the setup. Possible mitigation scenarios are:
- Turn off SSR for Hadoop 2
- Ensure that there is enough unallocated memory for the RS based on the expected # of store files
- Ensure that there is a lower number of regions per region server (hence fewer open files)

Stack trace:
{code}
org.apache.hadoop.hbase.DroppedSnapshotException: region: IntegrationTestLoadAndVerify,yC^P\xD7\x945\xD4,1363388517630.24655343d8d356ef708732f34cfe8946.
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1560)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1439)
    at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1380)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:449)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:215)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$500(MemStoreFlusher.java:63)
    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:237)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
    at java.nio.Bits.reserveMemory(Bits.java:632)
    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
    at org.apache.hadoop.hdfs.util.DirectBufferPool.getBuffer(DirectBufferPool.java:70)
    at org.apache.hadoop.hdfs.BlockReaderLocal.<init>(BlockReaderLocal.java:315)
    at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:208)
    at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:790)
    at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:888)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:455)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:645)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:689)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:312)
    at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:543)
    at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:589)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1261)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:512)
    at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:603)
    at org.apache.hadoop.hbase.regionserver.Store.validateStoreFile(Store.java:1568)
    at org.apache.hadoop.hbase.regionserver.Store.commitFile(Store.java:845)
    at org.apache.hadoop.hbase.regionserver.Store.access$500(Store.java:109)
    at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.commit(Store.java:2209)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1541)
{code}
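To see the mechanism in isolation: direct memory is capped by -XX:MaxDirectMemorySize (roughly the heap size when unset), and each BlockReaderLocal holds a direct buffer from a pool, so enough concurrently open readers exhaust the cap regardless of heap headroom. A minimal sketch of that failure mode — not the attached OpenFileTest.java, just an illustration assuming a 1 MB per-reader buffer:

{code}
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Run with e.g. -XX:MaxDirectMemorySize=64m. Each allocation stands in for
// the direct buffer a BlockReaderLocal holds per open block; once the cap
// is hit, allocateDirect fails with "OutOfMemoryError: Direct buffer memory".
public class DirectBufferOOM {
  public static void main(String[] args) {
    List<ByteBuffer> buffers = new ArrayList<>(); // hold refs, like open store files do
    int bufferSize = 1024 * 1024;                 // assumed SSR read buffer size: 1 MB
    for (int i = 0; ; i++) {
      buffers.add(ByteBuffer.allocateDirect(bufferSize));
      System.out.println("allocated " + (i + 1) + " MB of direct memory");
    }
  }
}
{code}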
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8143:
- Resolution: Fixed
  Release Note: Committed 0.96 and trunk. Thanks for reviews.
  Hadoop Flags: Reviewed
  Status: Resolved (was: Patch Available)
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8143:
- Attachment: 8143v2.txt

Implement Enis's suggestion on how to set SSR buffer value.
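A minimal sketch of that idea as I read it (not the literal 8143v2.txt patch; the key name comes from this discussion, and the 128 KB fallback is an assumption): apply HBase's smaller default only when the loaded configuration carries no value, so explicit operator tuning always wins.

{code}
import org.apache.hadoop.conf.Configuration;

// Hypothetical helper, for illustration only: apply a smaller SSR read
// buffer default unless dfs.client.read.shortcircuit.buffer.size is
// already present anywhere in the loaded configuration.
public final class SsrBufferDefault {
  static final String KEY = "dfs.client.read.shortcircuit.buffer.size";
  static final int FALLBACK = 128 * 1024; // assumed lowered default (128 KB)

  static void applyIfUnset(Configuration conf) {
    if (conf.get(KEY) == null) {   // no value loaded from any *-default/*-site.xml
      conf.setInt(KEY, FALLBACK);  // keep per-reader direct buffers small
    }
  }

  private SsrBufferDefault() {}
}
{code}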
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8143:
- Assignee: stack (was: Enis Soztutar)
  Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8143:
- Attachment: 8143.hbase-default.xml.txt

Setting dfs.client.read.shortcircuit.buffer.size in hbase-default.xml. How is this? Should backport it too.
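Since hbase-default.xml ships inside the HBase jar, a value placed there flows into every Configuration built through HBaseConfiguration.create(), while hbase-site.xml can still override it. A quick sketch for checking the effective value on a given deployment:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Print the SSR buffer size HBase clients will actually use: the value
// from hbase-default.xml unless hbase-site.xml (or code) overrides it.
public class ShowSsrBufferSize {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    int size = conf.getInt("dfs.client.read.shortcircuit.buffer.size", -1);
    System.out.println("effective dfs.client.read.shortcircuit.buffer.size = " + size);
  }
}
{code}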
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-8143:
- Fix Version/s: (was: 0.94.13)

Removing from 0.94. Can we force HDFS default settings?
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8143:
- Attachment: 8143doc.txt

It's a client-side config, no? Not for the hdfs-side. Here is a bit of doc for the reference guide that recommends setting this down from its default size. Does this do? If so, I'll commit (I'll try to clean up the stale SSR section a little too).
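For a sense of scale, a back-of-envelope using the numbers from the issue description (the 1 MB figure is the HDFS-side default for dfs.client.read.shortcircuit.buffer.size as I understand it, so treat it as an assumption): ~200 regions x 3-4 store files is roughly 600-800 open readers, and at 1 MB of direct buffer per BlockReaderLocal that is already 600-800 MB off-heap; with seeks opening additional readers per file, a multi-GB footprint like the observed 7g is plausible. Lowering the buffer to 128 KB shrinks the same worst case to roughly 75-100 MB, which is why the doc recommends setting it down from the default.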
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8143:
- Priority: Critical (was: Major)
  Fix Version/s: (was: 0.96.0) 0.96.1
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-8143:
- Fix Version/s: (was: 0.94.12) 0.94.13
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8143:
- Fix Version/s: (was: 0.95.2) 0.96.0
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-8143:
- Fix Version/s: (was: 0.94.11) 0.94.12
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-8143:
- Fix Version/s: (was: 0.94.10) 0.94.11
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-8143:
- Priority: Major (was: Critical)
  Fix Version/s: (was: 0.94.9) 0.94.10
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8143: - Fix Version/s: (was: 0.95.1) 0.95.2
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-8143: - Fix Version/s: (was: 0.94.8) 0.94.9
Anyway, since there is nothing really to do on the HBase side other than documenting a better default (iff short circuit reads are enabled), I'm pushing this to 0.94.9.
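Lars's point about documenting a better default can be made concrete. Under Hadoop 2 SSR, every open BlockReaderLocal pins pooled direct buffers, so direct memory use scales with the number of open store files rather than with -Xmx. Below is a minimal back-of-the-envelope sketch in Java, assuming roughly 1 MB of pooled direct buffer per open reader (an illustrative figure based on the commonly cited dfs.client.read.shortcircuit.buffer.size default, not a value confirmed in this thread):
{code}
// Back-of-the-envelope direct-memory estimate for a region server running
// with Hadoop 2 short-circuit reads. The per-reader figure is an assumption
// for illustration, not a value taken from this issue.
public class SsrDirectMemoryEstimate {
  public static void main(String[] args) {
    int regions = 200;              // regions per region server (as reported)
    int storeFilesPerRegion = 4;    // open store files per region (as reported)
    long perReaderBytes = 1L << 20; // assumed pooled direct buffer per open reader (~1 MB)

    long openFiles = (long) regions * storeFilesPerRegion;
    long directBytes = openFiles * perReaderBytes;

    System.out.println("open store files: " + openFiles);
    System.out.println("estimated direct memory MB: " + directBytes / (1024 * 1024));
    // None of this counts against -Xmx; it has to fit under -XX:MaxDirectMemorySize.
  }
}
{code}
Even at that conservative per-reader figure, the reported 200 regions with 3-4 open store files each put hundreds of MB outside the Java heap; larger configured buffer sizes, checksum buffers, or concurrent readers push the total toward the multiple GB this issue reports, which is why the documented default matters.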
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-8143: - Priority: Critical (was: Major)
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-8143: - Fix Version/s: (was: 0.94.7) 0.94.8
Moving out to 0.94.8.
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8143: - Fix Version/s: (was: 0.95.0) 0.95.1
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-8143: - Attachment: OpenFileTest.java
Attaching simple test code. If you run this against Hadoop-2.0.3-alpha with SSR on and -Xmx1g -XX:MaxDirectMemorySize=1g, you will see:
{code}
numFiles: 940
numFiles: 950
numFiles: 960
numFiles: 970
numFiles: 980
Exception in thread "pool-2-thread-14" java.lang.OutOfMemoryError: Direct buffer memory
	at java.nio.Bits.reserveMemory(Bits.java:632)
	at java.nio.DirectByteBuffer.init(DirectByteBuffer.java:97)
	at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
	at org.apache.hadoop.hdfs.util.DirectBufferPool.getBuffer(DirectBufferPool.java:59)
	at org.apache.hadoop.hdfs.BlockReaderLocal.init(BlockReaderLocal.java:315)
	at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:208)
	at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:790)
	at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:888)
	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:455)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:645)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:689)
	at java.io.DataInputStream.read(DataInputStream.java:132)
	at org.apache.hadoop.hbase.OpenFileTest.readFully(OpenFileTest.java:131)
	at org.apache.hadoop.hbase.OpenFileTest$FileCreater.createAndOpenFile(OpenFileTest.java:74)
	at org.apache.hadoop.hbase.OpenFileTest$FileCreater.run(OpenFileTest.java:57)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:680)
{code}
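The attached OpenFileTest.java itself is not reproduced in this thread, but the stack trace above implies its shape: worker threads keep creating and reading small HDFS files, holding the streams open so each one pins a BlockReaderLocal and its pooled direct buffers. The sketch below is a hypothetical approximation under those assumptions; class and method names are illustrative, not the attachment's actual code:
{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical stand-in for the attached OpenFileTest.java. Run against an
// HDFS with dfs.client.read.shortcircuit=true and a small
// -XX:MaxDirectMemorySize to reproduce the direct-buffer OOM.
public class OpenFileOomSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    List<FSDataInputStream> open = new ArrayList<FSDataInputStream>();
    byte[] buf = new byte[4096];
    int numFiles = 0;
    while (true) {
      Path p = new Path("/tmp/openfiletest/file-" + numFiles);
      FSDataOutputStream out = fs.create(p, true);
      out.write(buf); // one small local block per file
      out.close();
      FSDataInputStream in = fs.open(p);
      in.read(buf);   // sequential read forces a BlockReaderLocal with SSR on
      open.add(in);   // hold the stream open so its direct buffers stay pinned
      numFiles++;
      if (numFiles % 10 == 0) {
        System.out.println("numFiles: " + numFiles);
      }
    }
  }
}
{code}
With -XX:MaxDirectMemorySize=1g and on the order of 1 MB of pooled direct buffer per open reader, failing just short of 1,000 open files (numFiles: 980 above) is roughly what this arithmetic predicts.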
[jira] [Updated] (HBASE-8143) HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM
[ https://issues.apache.org/jira/browse/HBASE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-8143: - Summary: HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM (was: HBase on Hadoop 2 with local hort circuit reads (ssr) causes OOM)