[
https://issues.apache.org/jira/browse/QPID-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426699#comment-15426699
]
Lorenz Quack commented on QPID-7368:
------------------------------------
I am not convinced the above analysis is correct because in the attached log I
am missing the DEBUG log "Cancelling selector on accepting port" that should
come from the anonymous Runnable in {{SelectorThread#cancelAcceptingSocket}}
which is called from {{NonBlockingNetworkTransport#close}} before the
selections key is cancelled in the code path described above.
To me it seems there are two races. One in the vicinity of what was described
above and another which I think might be responsible for this particular
failure:
# The 0-th SelectorThread (which is responsible for accepting) calls
{{select}} and before it can process the selected keys another thread calls
{{runTasks}} and cancels the key. In this case, I don't see how deregistering
{{OP_ACCEPT}} before cancelling it would help.
# {{NetworkConnectionScheduler#processConnection}} calls
{{NetworkConnectionScheduler#removeConnection}} and
{{NonBlockingConnection#doWork}} which both end up calling
{{SelectorThread#unregisterConnection}} which also cancels the selection key.
Note, that in {{SelectionTask#performSelect}} we add {{this}} to the
{{_workQueue}} before we start calling
{{NetworkConnectionScheduler#processConnection}} via the
{{ConnectionProcessors}} This means another {{SelectorThread}} could pick up
this {{SelectionTask}} from the {{_workQueue}} and call
{{SelectionTask#processSelectionKeys}} racing with the cancellation of the
selection key.
So the events would be as follows:
* The test sends the invalid header.
* A selectorTask detects that the connection is selectable and puts it into
(wrapped into a {{ConnectionProcessor}}) into the connections list for further
processing.
* The same task adds itself in addition to adding the ConnectionProcessor to
the {{_workQueue}}
* One SelectorThread "IO-pool-Port-amqp-2" picks up the ConnectionProcessor
work at 17:29:07,884 while the original SelectorTask goes into another round of
select.
* In the processing thread from {{NonBlockingConnection#doWork}},
{{NonBlockingConnection#doRead}} is called which in turn calls
{{MultiVersionProtocolEngine.SelfDelegateProtocolEngine#received}} where it
writes the response and closes the {{NonBlockingConnection}} at 17:29:07,886
* {{NonBlockingConnection#close}} sets the {{_closed}} flag so that when the
{{NonBlockingConnection#doWork}} regains control it will call {{shutdown}} near
the end of the method. This ends up calling
{{SelectorThread#unregisterConnection}} which cancels the selection key for
this connection.
* In the meantime the original selectionTask is picked up by a
SelectorThread and calls select which will return the selection key for the
connection since it is ready for reading/writing. Before it can call
{{processSelectionKeys}} the above sequence happens invalidating the key. When
{{processSelectionKeys}} calls {{isAcceptable}} at 17:29:07,898 it blows up.
At least that is my hypothesis.
> CancelledKeyException from accepting socket during Broker shutdown
> ------------------------------------------------------------------
>
> Key: QPID-7368
> URL: https://issues.apache.org/jira/browse/QPID-7368
> Project: Qpid
> Issue Type: Bug
> Components: Java Broker
> Reporter: Keith Wall
> Fix For: qpid-java-6.1
>
> Attachments:
> TEST-org.apache.qpid.transport.ProtocolNegotiationTest.testWrongProtocolHeaderSent_BrokerRespondsWithSupportedProtocol.txt
>
>
> We saw the following test failure on trunk whilst the test was shutting down
> a Broker.
> https://builds.apache.org/view/M-R/view/Qpid/job/Qpid-Java-Java-Test-JDK1.8/lastCompletedBuild/testReport/org.apache.qpid.transport/ProtocolNegotiationTest/testWrongProtocolHeaderSent_BrokerRespondsWithSupportedProtocol/
> Thread terminated due to uncaught exception
> {noformat}
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isAcceptable(SelectionKey.java:360)
> at
> org.apache.qpid.server.transport.SelectorThread$SelectionTask.processSelectionKeys(SelectorThread.java:167)
> at
> org.apache.qpid.server.transport.SelectorThread$SelectionTask.performSelect(SelectorThread.java:305)
> at
> org.apache.qpid.server.transport.SelectorThread$SelectionTask.run(SelectorThread.java:87)
> at
> org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:462)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]