[ 
https://issues.apache.org/jira/browse/HBASE-16973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15621385#comment-15621385
 ] 

Yu Li commented on HBASE-16973:
-------------------------------

bq. You mean when set -1 explicitly treat as no caching? #1 says make the 
default as 128 so I dont think u mean to make def as no caching.
According to the code comment, -1 should be treated as no caching, but 
it actually is not. In {{HTable#getScanner}} on our current master branch 
we have:
{code}
    if (scan.getCaching() <= 0) {
      scan.setCaching(scannerCaching);
    }
{code}
This {{scannerCaching}} is initialized from 
{{connConfiguration.getScannerCaching()}}, and there we have:
{code}
    this.scannerCaching = conf.getInt(
      HConstants.HBASE_CLIENT_SCANNER_CACHING,
      HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING);
{code}
So with {{Scan.setCaching(-1)}}, and {{hbase.client.scanner.caching}} not 
set, the caching ends up as {{Integer.MAX_VALUE}}. I checked our 
[hbase book|http://hbase.apache.org/book.html] and it already says the default 
is {{Integer.MAX_VALUE}}, so for this part I guess we should update the code 
comment below to describe the actual behavior and avoid confusion:
{code}
  /*
   * -1 means no caching
   */
  private int caching = -1;
{code}
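
To make the fallback concrete, here is a minimal, self-contained sketch of the resolution path described above. It does not use the HBase classes; {{ScannerCachingSketch}}, {{configuredCaching}} and {{effectiveCaching}} are hypothetical names, and the default constant simply mirrors the value this issue attributes to {{HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING}}.

```java
public class ScannerCachingSketch {
    // Assumption: mirrors the default value discussed in this issue.
    static final int DEFAULT_SCANNER_CACHING = Integer.MAX_VALUE;

    // Hypothetical stand-in for conf.getInt(key, default); null means "not configured".
    static int configuredCaching(Integer clientSetting) {
        return clientSetting != null ? clientSetting : DEFAULT_SCANNER_CACHING;
    }

    // Same shape as the check quoted from HTable#getScanner:
    // a non-positive value on the Scan falls back to the connection-level setting.
    static int effectiveCaching(int scanCaching, Integer clientSetting) {
        if (scanCaching <= 0) {
            return configuredCaching(clientSetting);
        }
        return scanCaching;
    }

    public static void main(String[] args) {
        System.out.println(effectiveCaching(-1, null)); // 2147483647
        System.out.println(effectiveCaching(-1, 128));  // 128
        System.out.println(effectiveCaching(50, null)); // 50
    }
}
```

In this sketch -1 never means "no caching"; it only means "use the connection-level value", which is what makes the existing comment misleading.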

bq. So still there is no timeouts happening because of the partial result 
return stuff and/or heart beat. Correct?
Correct. The call ran for 24s and the timeout is set to 1min, so there were no 
timeouts and no partial results.

> Revisiting default value for hbase.client.scanner.caching
> ---------------------------------------------------------
>
>                 Key: HBASE-16973
>                 URL: https://issues.apache.org/jira/browse/HBASE-16973
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yu Li
>            Assignee: Yu Li
>
> We are observing below logs for a long-running scan:
> {noformat}
> 2016-10-30 08:51:41,692 WARN  
> [B.defaultRpcServer.handler=50,queue=12,port=16020] ipc.RpcServer:
> (responseTooSlow-LongProcessTime): {"processingtimems":24329,
> "call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)",
> "client":"11.251.157.108:50415","scandetails":"table: ae_product_image 
> region: ae_product_image,494:
> ,1476872321454.33171a04a683c4404717c43ea4eb8978.","param":"scanner_id: 
> 5333521 number_of_rows: 2147483647
> close_scanner: false next_call_seq: 8 client_handles_partials: true 
> client_handles_heartbeats: true",
> "starttimems":1477788677363,"queuetimems":0,"class":"HRegionServer","responsesize":818,"method":"Scan"}
> {noformat}
> From this we found that "number_of_rows" is as large as {{Integer.MAX_VALUE}}.
> We also observed a long filter list on the customized scan. After 
> checking the application code we confirmed that there's no {{Scan.setCaching}} 
> or {{hbase.client.scanner.caching}} setting on the client side, so it turns 
> out that with the default value the caching for Scan is Integer.MAX_VALUE, 
> which is really a big surprise.
> After checking the code and commit history, I found that HBASE-11544 
> changed {{HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING}} from 100 to 
> Integer.MAX_VALUE, and its release note says:
> {noformat}
> Scan caching default has been changed to Integer.Max_Value 
> This value works together with the new maxResultSize value from HBASE-12976 
> (defaults to 2MB) 
> Results returned from server on basis of size rather than number of rows 
> Provides better use of network since row size varies amongst tables
> {noformat}
> I'm afraid this does not consider the case of a scan with filters, 
> which may examine many rows but return only a small result.
> What's more, we still have the comment/code below in {{Scan.java}}:
> {code}
>   /*
>    * -1 means no caching
>    */
>   private int caching = -1;
> {code}
> But the implementation does not follow it (instead of no caching, we 
> are caching {{Integer.MAX_VALUE}}...).
> So here I'd like to bring up two points:
> 1. Change the default value of 
> HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING back to some small value like 128
> 2. Enforce the semantics of "no caching"
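
As a rough illustration of the filter concern in the description above, the sketch below simulates a single scan call that stops once it has collected {{caching}} rows or {{maxResultSize}} bytes. It is a toy model, not the region server's scan loop: all names and numbers are hypothetical, and it ignores any other limits the real server may apply.

```java
public class FilteredScanSketch {
    /**
     * Counts how many rows one simulated call examines before returning.
     * Every passEvery-th row passes the (hypothetical) filter and adds rowBytes to the response.
     */
    static long rowsExamined(long totalRows, int passEvery, int rowBytes,
                             int caching, long maxResultSize) {
        long examined = 0;
        long returned = 0;
        long bytes = 0;
        // Stop when the region is exhausted, the row limit is hit, or the size limit is hit.
        while (examined < totalRows && returned < caching && bytes < maxResultSize) {
            examined++;
            if (examined % passEvery == 0) {
                returned++;
                bytes += rowBytes;
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        long totalRows = 1_000_000L;          // rows in the simulated region
        int passEvery = 1000;                 // a very selective filter
        int rowBytes = 100;                   // size of each returned row
        long maxResultSize = 2L * 1024 * 1024; // 2MB, as in the quoted release note

        // With caching = Integer.MAX_VALUE neither limit is reached, so the whole region is examined.
        System.out.println(rowsExamined(totalRows, passEvery, rowBytes, Integer.MAX_VALUE, maxResultSize)); // 1000000
        // With caching = 128 the call returns after 128 matching rows.
        System.out.println(rowsExamined(totalRows, passEvery, rowBytes, 128, maxResultSize)); // 128000
    }
}
```

Under these assumptions a small row limit bounds the work done per call even when the filter keeps the response far below the size limit.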



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
