[
https://issues.apache.org/jira/browse/HBASE-16973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15669671#comment-15669671
]
Nick Dimiduk commented on HBASE-16973:
--------------------------------------
Trying to understand the state of things here for 1.1. Looks like HBASE-11544
made it, meaning {{DEFAULT_HBASE_CLIENT_SCANNER_CACHING = Integer.MAX_VALUE}};
thus the default limit based on total number of rows is effectively unbounded.
We also have HBASE-12976, so {{DEFAULT_HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE = 2
* 1024 * 1024}}. {{hbase.client.scanner.timeout.period}} is 1m in
hbase-default.xml. Does this mean that, for a highly selective filter, we'd end
up hitting the timeout and throwing away any partial results before the 2 MB is
filled? Or does it mean we go back to the client after 1m with whatever we've
accumulated so far? The former is a pretty bad situation and warrants some
comment about the sharp edge. I'm against changing the default this late in
the maintenance cycle, but a table in the book that breaks things out by
release branch would help users stumbling through the murk.
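To put rough numbers on the selective-filter case, here is a back-of-the-envelope
sketch in plain Java. It is not HBase code: the class name, the method, the 1 KB
row size and the 1-in-10,000 filter hit rate are all illustrative assumptions, and
it ignores timeouts and heartbeats. It only models a scan RPC that returns once
either the row limit (caching) or the size limit (max result size) is reached.

```java
public class ScanLimitSketch {

    /**
     * Rows the server would have to examine before a single scan RPC can return,
     * under a simplified model with no timeout and no heartbeat.
     *
     * @param caching             row limit per RPC (hypothetical stand-in for scanner caching)
     * @param maxResultSizeBytes  size limit per RPC (hypothetical stand-in for max result size)
     * @param bytesPerMatchingRow assumed size of each row that passes the filter
     * @param rowsPerMatch        assumed number of rows examined per row that passes the filter
     */
    static long rowsExaminedPerRpc(long caching, long maxResultSizeBytes,
                                   long bytesPerMatchingRow, long rowsPerMatch) {
        // Matching rows needed to reach the size limit (rounded up).
        long rowsToFillSize = (maxResultSizeBytes + bytesPerMatchingRow - 1) / bytesPerMatchingRow;
        // The RPC returns as soon as either limit is reached.
        long matchesNeeded = Math.min(caching, rowsToFillSize);
        return matchesNeeded * rowsPerMatch;
    }

    public static void main(String[] args) {
        long maxResultSize = 2L * 1024 * 1024; // 2 MB, the default quoted above
        long bytesPerRow = 1024;               // assumption: 1 KB per matching row
        long rowsPerMatch = 10_000;            // assumption: filter passes 1 row in 10,000

        long unbounded = rowsExaminedPerRpc(Integer.MAX_VALUE, maxResultSize, bytesPerRow, rowsPerMatch);
        long bounded = rowsExaminedPerRpc(128, maxResultSize, bytesPerRow, rowsPerMatch);

        System.out.println("caching=Integer.MAX_VALUE: rows examined per RPC = " + unbounded);
        System.out.println("caching=128:               rows examined per RPC = " + bounded);
    }
}
```

Under these assumptions only the size limit applies with the unbounded default, so a
single RPC has to wade through 16 times as many rows as it would with caching=128.
Whether that work finishes inside the 1m timeout is exactly the open question.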
> Revisiting default value for hbase.client.scanner.caching
> ---------------------------------------------------------
>
> Key: HBASE-16973
> URL: https://issues.apache.org/jira/browse/HBASE-16973
> Project: HBase
> Issue Type: Task
> Reporter: Yu Li
> Assignee: Yu Li
> Attachments: Scan.next_p999.png
>
>
> We are observing the log below for a long-running scan:
> {noformat}
> 2016-10-30 08:51:41,692 WARN
> [B.defaultRpcServer.handler=50,queue=12,port=16020] ipc.RpcServer:
> (responseTooSlow-LongProcessTime): {"processingtimems":24329,
> "call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)",
> "client":"11.251.157.108:50415","scandetails":"table: ae_product_image
> region: ae_product_image,494:
> ,1476872321454.33171a04a683c4404717c43ea4eb8978.","param":"scanner_id:
> 5333521 number_of_rows: 2147483647
> close_scanner: false next_call_seq: 8 client_handles_partials: true
> client_handles_heartbeats: true",
> "starttimems":1477788677363,"queuetimems":0,"class":"HRegionServer","responsesize":818,"method":"Scan"}
> {noformat}
> From this we found that "number_of_rows" is as big as {{Integer.MAX_VALUE}}.
> We also observed a long filter list on the customized scan. After
> checking the application code we confirmed that there's no {{Scan.setCaching}} or
> {{hbase.client.scanner.caching}} setting on the client side, so it turns out
> that with the default value the caching for Scan will be {{Integer.MAX_VALUE}},
> which is really a big surprise.
> After checking the code and commit history, I found it was HBASE-11544 that
> changed {{HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING}} from 100 to
> {{Integer.MAX_VALUE}}, and the release note there says the following:
> {noformat}
> Scan caching default has been changed to Integer.Max_Value
> This value works together with the new maxResultSize value from HBASE-12976
> (defaults to 2MB)
> Results returned from server on basis of size rather than number of rows
> Provides better use of network since row size varies amongst tables
> {noformat}
> I'm afraid this fails to consider the case of a scan with filters,
> which may examine many rows but return only a small result.
> What's more, we still have the following comment/code in {{Scan.java}}:
> {code}
> /*
> * -1 means no caching
> */
> private int caching = -1;
> {code}
> But the actual implementation does not follow this (instead of no caching, we
> are caching {{Integer.MAX_VALUE}} rows...).
> So here I'd like to bring up two points:
> 1. Change the default value of
> {{HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING}} back to some small value like 128
> 2. Re-enforce the semantics of "no caching"