[
https://issues.apache.org/jira/browse/DL-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15763511#comment-15763511
]
ASF GitHub Bot commented on DL-145:
-----------------------------------
GitHub user xieliang opened a pull request:
https://github.com/apache/incubator-distributedlog/pull/78
DL-145 : write requests should be errored out immediately even if the
rolling writer is still being created
All test cases passed locally; the
TestDistributedLogService#testServiceTimeout case is now stable on my box.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/xieliang/incubator-distributedlog DL-145
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-distributedlog/pull/78.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #78
----
commit 6be8ca4d01c4c40947d4b901f0299c8dcc97c509
Author: xieliang <[email protected]>
Date: 2016-12-20T07:19:38Z
write requests should be errored out immediately even if the rolling
writer is still being created
----
> Fix the flaky testServiceTimeout
> --------------------------------
>
> Key: DL-145
> URL: https://issues.apache.org/jira/browse/DL-145
> Project: DistributedLog
> Issue Type: Test
> Components: distributedlog-service
> Affects Versions: 0.4.0
> Reporter: Liang Xie
> Assignee: Liang Xie
>
> The TestDistributedLogService#testServiceTimeout case is not stable, e.g.
> https://builds.apache.org/job/distributedlog-precommit-pullrequest/22/com.twitter$distributedlog-service/testReport/com.twitter.distributedlog.service/TestDistributedLogService/testServiceTimeout/
> It could be reproduced occasionally on my box, and the failures became
> stable if I lowered ServiceTimeoutMs from 200 to 150; the test always
> passed when I raised it to a larger value, e.g. 1000 (btw, my disk is an
> SSD).
> After digging into it, the failure turned out to be related to a corner
> case when starting a new log segment.
> In the good case, once a service timeout occurs, the stream status goes
> ERROR -> CLOSING -> CLOSED; calling Abortables.asyncAbort aborts the
> cached log segment, and the writeOp is then completed with an exception,
> e.g. a write-cancelled exception.
> In the bad case, no log records have been written yet, so a new log
> segment is being started asynchronously. When the timeout occurs, the
> segment creation has not finished, so there is no cached segment and
> asyncAbort has no chance to abort it.
> I think changing the test timeout to a larger value should be fine for
> this particular corner case.
> I will attach a minor patch later. Any suggestions are welcome.
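The race described in the issue can be sketched as a minimal simulation. The class and method names below are illustrative stand-ins, not the actual DistributedLog APIs; the only assumption carried over from the issue is that the abort path can only cancel a segment that has already been created and cached.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the race: asyncAbort can only abort a log segment
// that has finished creation and landed in the cache. If the service timeout
// fires while the asynchronous segment creation is still in flight, there is
// nothing in the cache to abort.
public class SegmentAbortRace {

    static final AtomicReference<String> cachedSegment = new AtomicReference<>();

    // Asynchronously "create" a new log segment, caching it once done.
    static CompletableFuture<Void> startNewSegment(long creationMs) {
        return CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(creationMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            cachedSegment.set("segment-1");
        });
    }

    // Abort whatever is cached; returns whether anything was actually aborted.
    static boolean asyncAbort() {
        return cachedSegment.getAndSet(null) != null;
    }

    // Run one timeout-vs-creation scenario and report whether the abort
    // managed to cancel the segment.
    static boolean runScenario(long timeoutMs, long creationMs) throws Exception {
        cachedSegment.set(null);
        CompletableFuture<Void> creating = startNewSegment(creationMs);
        Thread.sleep(timeoutMs);        // the service timeout elapses
        boolean aborted = asyncAbort(); // the abort only sees the cache
        creating.join();                // creation may finish after the abort
        return aborted;
    }

    public static void main(String[] args) throws Exception {
        // Bad case: timeout (50 ms) fires before creation (300 ms) completes,
        // so nothing is aborted and the writeOp never gets its exception.
        System.out.println("bad case aborted:  " + runScenario(50, 300));
        // Good case: creation (50 ms) completes before the timeout (300 ms).
        System.out.println("good case aborted: " + runScenario(300, 50));
    }
}
```

In this model the bad case prints false: the segment finishes creation only after the abort has already run, which mirrors why the test hangs until the write is errored out immediately instead of waiting on the abort.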
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)