[ https://issues.apache.org/jira/browse/KAFKA-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196767#comment-15196767 ]
Ewen Cheslack-Postava commented on KAFKA-3290:
----------------------------------------------

This was reopened because it seemed to be triggered very infrequently, but I'm actually encountering something a bit different that might also be related. I managed to grab a thread dump at (approximately) the point of failure, and I think this is the relevant piece:

{quote}
"pool-3-thread-1"
   java.lang.Thread.State: TIMED_WAITING
        at java.lang.Object.wait(Native Method)
        at org.apache.kafka.connect.runtime.WorkerSourceTask.commitOffsets(WorkerSourceTask.java:284)
        at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:159)
        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:126)
        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:139)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
{quote}

Triggering this seems to require parallel test execution, and there must be at least a bit of load on my CPU. I'm not sure the output is perfect because during some of the test runs it seemed like the stdout reporting may be somehow broken by parallel test execution (e.g. some output seemed cut off at the wrong place).

It looks like the accounting for outstanding records might not be handled properly because a commit that happens when the SourceTask is being stopped ends up waiting for producer callbacks for some outstanding records. However, I'm skeptical of that assessment because even adding some debugging printouts to stdout is not showing any records being produced in the stdout of the failed tests....

[~hachikuji] I found this while trying to merge KAFKA-3394, which is obviously unrelated.
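For readers following the thread dump, here is a minimal sketch of the kind of accounting described above: a commit that blocks, with a timeout, until producer callbacks have acknowledged every outstanding record. This is not the actual WorkerSourceTask code; the class name `OutstandingRecords` and its methods are hypothetical and only illustrate why a thread in such a wait would show up as TIMED_WAITING in `Object.wait`.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not taken from Kafka Connect. */
class OutstandingRecords {
    private int outstanding = 0; // records sent but not yet acknowledged

    /** Called when a record is handed to the producer. */
    public synchronized void recordSent() {
        outstanding++;
    }

    /** Called from the producer callback once a record is acknowledged. */
    public synchronized void recordAcked() {
        outstanding--;
        if (outstanding == 0) {
            notifyAll(); // wake any thread blocked in awaitFlush
        }
    }

    /**
     * Blocks until no records are outstanding or the timeout elapses.
     * Returns true if everything was acknowledged in time.
     */
    public synchronized boolean awaitFlush(long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (outstanding > 0) {
            long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMs <= 0) {
                return false; // timed out with records still unacknowledged
            }
            try {
                // A thread parked here is reported as TIMED_WAITING in Object.wait
                wait(remainingMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

Under this sketch, if the counter is incremented without a matching callback (or the callback is delivered late under CPU load), a commit issued during shutdown sits in `awaitFlush` until the timeout, which would be consistent with the stack trace above.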
Seems like as soon as I merged [~jcustenborder]'s commitRecord patch to trunk it decided to start failing often for me... I think these two issues might have the same cause (some threading/timing related issue).

> WorkerSourceTask testCommit transient failure
> ---------------------------------------------
>
>                 Key: KAFKA-3290
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3290
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: copycat
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>             Fix For: 0.10.0.0
>
>
> From recent failed build:
> {code}
> org.apache.kafka.connect.runtime.WorkerSourceTaskTest > testCommit FAILED
>     java.lang.AssertionError:
>       Expectation failure on verify:
>         Listener.onStartup(job-0): expected: 1, actual: 1
>         Listener.onShutdown(job-0): expected: 1, actual: 1
>         at org.easymock.internal.MocksControl.verify(MocksControl.java:225)
>         at org.powermock.api.easymock.internal.invocationcontrol.EasyMockMethodInvocationControl.verify(EasyMockMethodInvocationControl.java:132)
>         at org.powermock.api.easymock.PowerMock.verify(PowerMock.java:1466)
>         at org.powermock.api.easymock.PowerMock.verifyAll(PowerMock.java:1405)
>         at org.apache.kafka.connect.runtime.WorkerSourceTaskTest.testCommit(WorkerSourceTaskTest.java:221)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)