[ https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075782#comment-14075782 ]
Lars Hofhansl commented on HBASE-11544:
---------------------------------------

The user just wants to call scanner.next(), which returns a row. That's it. Everything else leaks performance concerns, which HBase should handle automatically, into the API.

In the past we mixed the performance concern (too many RPCs for small rows) in with the API (set number of rows). That was obviously a quick fix (mistake).

The optimal bytes/RPC ratio is a function of the network bandwidth and latency.

The only extra API needed on the client is setBatch() so that a client application has a way to deal with rows too large for *it* to handle.

bq. what if response data size is smaller than a chunk?

It just sends what it has. I am really not proposing anything different from what networking protocols have been doing for 30 years :)

> [Ergonomics] hbase.client.scanner.caching is dogged and will try to return batch even if it means OOME
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-11544
>                 URL: https://issues.apache.org/jira/browse/HBASE-11544
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>              Labels: noob
>
> Running some tests, I set hbase.client.scanner.caching=1000. Dataset has large cells. I kept OOME'ing.
> Serverside, we should measure how much we've accumulated and return to the client whatever we've gathered once we pass a certain size threshold rather than keep accumulating till we OOME.
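
For illustration only, the size-threshold idea in the issue description (return what has been gathered once a byte limit is passed, and send whatever is left when the scan ends) could be sketched as below. This is not HBase code: the function name size_bounded_batches and the parameter max_batch_bytes are hypothetical, rows are modeled as plain byte strings, and Python is used only to keep the sketch short.

```python
from typing import Iterable, Iterator, List


def size_bounded_batches(rows: Iterable[bytes], max_batch_bytes: int) -> Iterator[List[bytes]]:
    """Group rows into batches, closing a batch once its size reaches the limit.

    Hypothetical helper: models the proposed server-side behaviour, not an HBase API.
    """
    batch: List[bytes] = []
    batch_bytes = 0
    for row in rows:
        batch.append(row)
        batch_bytes += len(row)
        # The limit is checked after adding the row, so a single oversized row
        # is still returned instead of blocking the scan.
        if batch_bytes >= max_batch_bytes:
            yield batch
            batch, batch_bytes = [], 0
    if batch:
        # End of scan: send what has been gathered, even if it is under the limit.
        yield batch
```

With an 8-byte limit, two 4-byte rows form one batch, a 10-byte row goes out on its own, and a trailing 1-byte row is sent as a final small batch. The row count per response falls out of the byte limit rather than being set by the caller.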