Implement scanner caching throttling to prevent too big responses
-----------------------------------------------------------------
Key: HBASE-5607
URL: https://issues.apache.org/jira/browse/HBASE-5607
Project: HBase
Issue Type: Improvement
Reporter: Ferdy Galema

When a misconfigured client retrieves fat rows with a scanner caching value set too high, there is a big chance the regionserver cannot handle the response buffers (see the log excerpt below). Also see the mailing list thread: http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/24819

This issue tracks a solution that throttles the scanner caching value when the response buffers grow too big. A few possible solutions:

a) If a response is (repeatedly) over 100MB (configurable), reduce the scanner caching to half its size. (In either the server or the client.)
b) Introduce a property that defines a fixed (target) response size, instead of defining the number of rows to cache.

2012-03-20 07:57:40,092 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020, responseTooLarge for: next(4438820558358059204, 1000) from 172.23.122.15:50218: Size: 105.0m
2012-03-20 07:57:53,226 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, responseTooLarge for: next(-7429189123174849941, 1000) from 172.23.122.15:50218: Size: 214.4m
2012-03-20 07:57:57,839 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020, responseTooLarge for: next(-7429189123174849941, 1000) from 172.23.122.15:50218: Size: 103.2m
2012-03-20 07:57:59,442 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020, responseTooLarge for: next(-7429189123174849941, 1000) from 172.23.122.15:50218: Size: 101.8m
2012-03-20 07:58:20,025 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020, responseTooLarge for: next(9033159548564260857, 1000) from 172.23.122.15:50218: Size: 107.2m
2012-03-20 07:58:27,273 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, responseTooLarge for: next(9033159548564260857, 1000) from 172.23.122.15:50218: Size: 100.1m
2012-03-20 07:58:52,783 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020, responseTooLarge for: next(-8611621895979000997, 1000) from 172.23.122.15:50218: Size: 101.7m
2012-03-20 07:59:02,541 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020, responseTooLarge for: next(-511305750191148153, 1000) from 172.23.122.15:50218: Size: 120.9m
2012-03-20 07:59:25,346 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020, responseTooLarge for: next(1570572538285935733, 1000) from 172.23.122.15:50218: Size: 107.8m
2012-03-20 07:59:46,805 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, responseTooLarge for: next(-727080724379055435, 1000) from 172.23.122.15:50218: Size: 102.7m
2012-03-20 08:00:00,138 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, responseTooLarge for: next(-3701270248575643714, 1000) from 172.23.122.15:50218: Size: 122.1m
2012-03-20 08:00:21,232 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 157.5m
2012-03-20 08:00:23,199 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 160.7m
2012-03-20 08:00:28,174 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 160.8m
2012-03-20 08:00:32,643 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 182.4m
2012-03-20 08:00:36,826 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 237.2m
2012-03-20 08:00:40,850 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 212.7m
2012-03-20 08:00:44,736 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 232.9m
2012-03-20 08:00:49,471 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 227.2m
2012-03-20 08:00:57,566 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=c15.kalooga.nl,60020,1331900161295, load=(requests=706, regions=166, usedHeap=1505, maxHeap=1995): OutOfMemoryError, aborting
java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
        at org.apache.hadoop.hbase.ipc.ByteBufferOutputStream.<init>(ByteBufferOutputStream.java:44)
        at org.apache.hadoop.hbase.ipc.ByteBufferOutputStream.<init>(ByteBufferOutputStream.java:37)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1064)
2012-03-20 08:00:57,567 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: requests=189, regions=166, stores=987, storefiles=1401, storefileIndexSize=222, memstoreSize=693, compactionQueueSize=0, flushQueueSize=0, usedHeap=1635, maxHeap=1995, blockCacheSize=315474864, blockCacheFree=103051152, blockCacheCount=4422, blockCacheHitCount=1152306, blockCacheMissCount=12464451, blockCacheEvictedCount=5585715, blockCacheHitRatio=8, blockCacheHitCachingRatio=16
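For discussion, the two proposed strategies could be sketched roughly as below. This is a hypothetical helper, not part of the HBase API; the class name `ScannerCachingThrottle`, its method names, and all parameters are illustrative assumptions. Solution (a) halves the caching value whenever the previous response exceeded the configurable cap; solution (b) derives a row count from a target response size and an observed average row size.

```java
/**
 * Hypothetical sketch of the two throttling strategies from this issue.
 * None of these names exist in HBase; they only illustrate the idea.
 */
public final class ScannerCachingThrottle {

    private ScannerCachingThrottle() {}

    /**
     * Solution (a): if the last next() response exceeded the configurable
     * cap (e.g. 100MB), halve the scanner caching value, never going below
     * a floor of minCaching rows. Could run in either the server or client.
     */
    public static int halveIfTooLarge(int caching, long responseBytes,
                                      long maxResponseBytes, int minCaching) {
        if (responseBytes > maxResponseBytes) {
            return Math.max(minCaching, caching / 2);
        }
        return caching;
    }

    /**
     * Solution (b): instead of configuring a row count, configure a target
     * response size and compute the row count from the average row size
     * observed so far, capped at maxCaching rows.
     */
    public static int rowsForTargetSize(long targetResponseBytes,
                                        long avgRowBytes, int maxCaching) {
        if (avgRowBytes <= 0) {
            // No size estimate yet; fall back to the configured maximum.
            return maxCaching;
        }
        long rows = targetResponseBytes / avgRowBytes;
        return (int) Math.min(maxCaching, Math.max(1, rows));
    }
}
```

With the log excerpt above (caching=1000, cap=100MB), a 214.4MB response would drop the caching to 500, and a repeat offender to 250, and so on, until responses fit under the cap.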