[ https://issues.apache.org/jira/browse/HBASE-16973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15624231#comment-15624231 ]
Yu Li commented on HBASE-16973:
-------------------------------

Thanks for chiming in [~enis]

Yes, we have 3 kinds of limits for scan, and the row limit is removed by default after HBASE-11544. I'm convinced to keep the default as is for branch-1.1+, but this is indeed a behavior change from 0.98 to 1.x, and it requires users to explicitly set {{hbase.client.scanner.caching}} in some cases. In our case, the scan.next p999 latency increased from seconds to minutes with the default value... It's a bad user experience, since the application is unchanged but performance degrades...

> Revisiting default value for hbase.client.scanner.caching
> ---------------------------------------------------------
>
>                 Key: HBASE-16973
>                 URL: https://issues.apache.org/jira/browse/HBASE-16973
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: Scan.next_p999.png
>
>
> We are observing the logs below for a long-running scan:
> {noformat}
> 2016-10-30 08:51:41,692 WARN
> [B.defaultRpcServer.handler=50,queue=12,port=16020] ipc.RpcServer:
> (responseTooSlow-LongProcessTime): {"processingtimems":24329,
> "call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)",
> "client":"11.251.157.108:50415","scandetails":"table: ae_product_image
> region: ae_product_image,494:
> ,1476872321454.33171a04a683c4404717c43ea4eb8978.","param":"scanner_id:
> 5333521 number_of_rows: 2147483647
> close_scanner: false next_call_seq: 8 client_handles_partials: true
> client_handles_heartbeats: true",
> "starttimems":1477788677363,"queuetimems":0,"class":"HRegionServer","responsesize":818,"method":"Scan"}
> {noformat}
> From this we found that "number_of_rows" is as big as {{Integer.MAX_VALUE}}.
> We also observed a long filter list on the customized scan. After
> checking the application code, we confirmed that there is no {{Scan.setCaching}} or
> {{hbase.client.scanner.caching}} setting on the client side, so it turns out that
> with the default value the caching for Scan will be Integer.MAX_VALUE, which
> is really a big surprise.
> After checking the code and commit history, I found it's HBASE-11544 which
> changes {{HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING}} from 100 to
> Integer.MAX_VALUE, and in the release note there I could see the note below:
> {noformat}
> Scan caching default has been changed to Integer.Max_Value
> This value works together with the new maxResultSize value from HBASE-12976
> (defaults to 2MB)
> Results returned from server on basis of size rather than number of rows
> Provides better use of network since row size varies amongst tables
> {noformat}
> I'm afraid this does not consider the case of a scan with filters,
> which may examine many rows but return only a small result.
> What's more, we still have the comment/code below in {{Scan.java}}:
> {code}
>   /*
>    * -1 means no caching
>    */
>   private int caching = -1;
> {code}
> But the implementation does not actually follow it (instead of no caching, we
> are caching {{Integer.MAX_VALUE}}...).
> So here I'd like to bring up two points:
> 1. Change back the default value of
> HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING to some small value like 128
> 2. Re-enforce the semantics of "no caching"

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
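
To make the filter concern concrete, here is a minimal, self-contained sketch in plain Java. It is a toy model, not HBase code: the class {{ScanLimitSketch}} and both method names are hypothetical, the fallback order (per-scan value, then client config, then built-in default) is an assumption based on the description above, and the model ignores any time-based limit or heartbeat handling. It only shows that when a filter passes very few rows, neither a row limit of Integer.MAX_VALUE nor a 2MB size limit stops a single call early, while a small row limit such as 128 does.

```java
/** Toy model of scan limits; an illustration only, not HBase code. */
final class ScanLimitSketch {

    /** Assumed fallback order: per-scan value, then client config, then built-in default. */
    static int effectiveCaching(int scanCaching, Integer configuredCaching, int builtInDefault) {
        if (scanCaching > 0) {
            return scanCaching;          // a per-scan value was set explicitly
        }
        if (configuredCaching != null) {
            return configuredCaching;    // a client-side config value was set
        }
        return builtInDefault;           // nothing set: use the built-in default
    }

    /**
     * Rows examined in one call before the row limit or the size limit trips,
     * when the filter passes only every Nth row.
     */
    static long rowsExaminedPerCall(long regionRows, int passEveryNth,
                                    long bytesPerRow, int caching, long maxResultBytes) {
        long examined = 0;
        long returned = 0;
        long bytes = 0;
        while (examined < regionRows && returned < caching && bytes < maxResultBytes) {
            examined++;
            if (examined % passEveryNth == 0) {  // this row survives the filter
                returned++;
                bytes += bytesPerRow;
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        long twoMb = 2L * 1024 * 1024;           // size limit named in the release note
        int unset = -1;                          // per-scan caching left at its initial value
        int wide = effectiveCaching(unset, null, Integer.MAX_VALUE);
        int narrow = effectiveCaching(unset, 128, Integer.MAX_VALUE);
        // Made-up workload: 10M rows of 1 KB each, filter passes 1 row in 10,000.
        System.out.println(rowsExaminedPerCall(10_000_000L, 10_000, 1024, wide, twoMb));   // 10000000
        System.out.println(rowsExaminedPerCall(10_000_000L, 10_000, 1024, narrow, twoMb)); // 1280000
    }
}
```

With these made-up numbers, the matching rows total about 1 MB, so the size limit never trips and the call with the Integer.MAX_VALUE row limit walks the whole region, whereas the call limited to 128 rows returns after examining 1,280,000 rows.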