[ 
https://issues.apache.org/jira/browse/KYLIN-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15619993#comment-15619993
 ] 

Dayue Gao commented on KYLIN-2079:
----------------------------------

[~mahongbin], the way Kylin avoids retrying the coprocessor call in these cases 
is by responding successfully with the flag normalComplete set to false, not by 
throwing DoNotRetryException. We just have to respond before hbase.rpc.timeout 
expires, which is why I set the upper bound of 
kylin.query.coprocessor.timeout.seconds to hbase.rpc.timeout x 0.9. I tried 
using DoNotRetryException before I realized this.
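
A minimal sketch of the idea above, not the actual CubeVisitService code: the 
scan checks a deadline and, once it has passed, returns an ordinary response 
whose normalComplete flag is false instead of throwing. Only the flag name 
comes from this comment; the Response class, the scan method, and the 
injected clock are hypothetical and exist purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

final class SelfTerminatingScan {
    /** Hypothetical response type; only the normalComplete flag is from the comment. */
    static final class Response {
        final List<String> rows;
        final boolean normalComplete;

        Response(List<String> rows, boolean normalComplete) {
            this.rows = rows;
            this.normalComplete = normalComplete;
        }
    }

    /**
     * Scans rows until the source is exhausted or the deadline passes.
     * On timeout it still returns normally, so the RPC layer sees a
     * successful call and has nothing to retry.
     */
    static Response scan(List<String> source, long timeoutMillis, LongSupplier clockMillis) {
        long deadline = clockMillis.getAsLong() + timeoutMillis;
        List<String> out = new ArrayList<>();
        for (String row : source) {
            if (clockMillis.getAsLong() >= deadline) {
                // Self-terminate: report the partial result as "not complete".
                return new Response(out, false);
            }
            out.add(row);
        }
        return new Response(out, true);
    }

    public static void main(String[] args) {
        Response r = scan(List.of("a", "b", "c"), 1000L, System::currentTimeMillis);
        // The caller inspects the flag and fails the query itself, without retrying.
        System.out.println("normalComplete=" + r.normalComplete + ", rows=" + r.rows.size());
    }
}
```

The clock is passed in only so the timeout path can be exercised 
deterministically; a real coprocessor would presumably read the system clock.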

> add explicit configuration knob for coprocessor timeout
> -------------------------------------------------------
>
>                 Key: KYLIN-2079
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2079
>             Project: Kylin
>          Issue Type: Sub-task
>          Components: Storage - HBase
>    Affects Versions: v1.5.4.1
>            Reporter: Dayue Gao
>            Assignee: Dayue Gao
>             Fix For: v1.6.0
>
>         Attachments: KYLIN-2079.patch
>
>
> The current self-termination timeout for CubeVisitService is calculated as 
> the product of three parameters:
> * hbase.rpc.timeout
> * hbase.client.retries.number (hardcoded to 5)
> * kylin.query.cube.visit.timeout.times
> This has a few problems:
> # Because this timeout is longer than hbase.rpc.timeout, the user sees "Error 
> in coprocessor" instead of the more descriptive GTScanSelfTerminatedException. 
> Moreover, the request (probably a bad query) will be retried 5 times, 
> increasing pressure on the regionserver.
> # It's not intuitive to set the coprocessor timeout by adjusting 
> kylin.query.cube.visit.timeout.times.
> I propose the following changes:
> # Add a new Kylin configuration "kylin.query.coprocessor.timeout.seconds" to 
> explicitly set the coprocessor timeout. It defaults to 0, which means no 
> value is set and hbase.rpc.timeout x 0.9 is used instead. When the user sets 
> it to a positive number, Kylin will use min(hbase.rpc.timeout x 0.9, 
> kylin.query.coprocessor.timeout.seconds) as the coprocessor timeout.
> # Remove "kylin.query.cube.visit.timeout.times". The cube visit timeout 
> (ExpectedSizeIterator) is really a last resort, in case the coprocessor 
> doesn't terminate itself. I don't see much need for users to control it; 
> setting it to coprocessor timeout x 10 should be large enough.
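
The timeout rules proposed in the description above can be sketched as 
follows. This is an illustration under stated assumptions, not the attached 
patch: the class and method names are hypothetical, hbase.rpc.timeout is 
taken in milliseconds (HBase's unit for that setting), and the configured 
value is converted from seconds before the comparison.

```java
final class CoprocessorTimeouts {
    private CoprocessorTimeouts() {}

    /** Upper bound: 90% of hbase.rpc.timeout, leaving room to respond before the RPC expires. */
    static long upperBoundMillis(long hbaseRpcTimeoutMillis) {
        // Integer arithmetic avoids floating-point rounding.
        return hbaseRpcTimeoutMillis * 9 / 10;
    }

    /**
     * configuredSeconds stands for kylin.query.coprocessor.timeout.seconds.
     * 0 (the default) means "not set": fall back to the upper bound.
     * A positive value is capped at the upper bound.
     */
    static long coprocessorTimeoutMillis(long hbaseRpcTimeoutMillis, int configuredSeconds) {
        long upperBound = upperBoundMillis(hbaseRpcTimeoutMillis);
        if (configuredSeconds <= 0) {
            return upperBound;
        }
        return Math.min(upperBound, configuredSeconds * 1000L);
    }

    /** Last-resort cube visit timeout: coprocessor timeout x 10. */
    static long cubeVisitTimeoutMillis(long coprocessorTimeoutMillis) {
        return coprocessorTimeoutMillis * 10;
    }

    public static void main(String[] args) {
        long rpcTimeout = 60_000L; // assumed hbase.rpc.timeout of 60 s
        System.out.println(coprocessorTimeoutMillis(rpcTimeout, 0));   // 54000 (unset)
        System.out.println(coprocessorTimeoutMillis(rpcTimeout, 30));  // 30000 (below the bound)
        System.out.println(coprocessorTimeoutMillis(rpcTimeout, 120)); // 54000 (capped)
        System.out.println(cubeVisitTimeoutMillis(54_000L));           // 540000
    }
}
```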



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
