[ https://issues.apache.org/jira/browse/DIRMINA-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Emmanuel Lecharny resolved DIRMINA-779.
---------------------------------------

    Resolution: Fixed

Using a lock instead of a synchronized (this) does not break the ssltest, and should provide the right protection against concurrent access.

Fixed with commit 56a6e58004ea4a6af5640f03d1f6796a073c5d62

> SSLHandler can re-order data that it reads
> ------------------------------------------
>
>                 Key: DIRMINA-779
>                 URL: https://issues.apache.org/jira/browse/DIRMINA-779
>             Project: MINA
>          Issue Type: Bug
>          Components: Filter
>    Affects Versions: 1.1.7, 2.0.0-M1, 2.0.0-M2, 2.0.0-M3, 2.0.0-M4, 2.0.0-M5,
>                      2.0.0-M6, 2.0.0-RC1
>            Reporter: Jason Resch
>             Fix For: 2.0.8
>
>         Attachments: ssl_reodering_fix.diff
>
>
> The code in question is the flushScheduledEvents() method in SSLHandler.java:
> {code}
>     public void flushScheduledEvents() {
>         // Fire events only when no lock is hold for this handler.
>         if (Thread.holdsLock(this)) {
>             return;
>         }
>         Event e;
>         // We need synchronization here inevitably because filterWrite can be
>         // called simultaneously and cause 'bad record MAC' integrity error.
>         synchronized (this) {
>             while ((e = filterWriteEventQueue.poll()) != null) {
>                 e.nextFilter.filterWrite(session, (WriteRequest) e.data);
>             }
>         }
>         while ((e = messageReceivedEventQueue.poll()) != null) {
>             e.nextFilter.messageReceived(session, e.data);
>         }
>     }
> {code}
> This method is called both by threads which handle writes, and threads that
> handle reads. Therefore, as the comments suggest, multiple threads may go
> through this code simultaneously. However, since there is no
> synchronization around processing of the messageReceivedEventQueue, it is
> possible that the received messages will be sent to the next filter out of
> order, should there be more than one message in the queue and a context
> switch happen at the wrong time.
> The bug would manifest in our application as a failure of our protocol layer
> to decode a message, we believe, due to a re-ordering. It only occurred in
> environments with a large amount of contention and network traffic and when
> using TLS. The fix I have tested was to move the closing brace of the
> synchronized block to extend to cover both while loops. I've attached a
> patch representing that change. Since making that change we have not
> encountered the bug again after about 30 hours of testing and 1.5 TB of
> traffic, whereas before the change we could reproduce it after a few
> minutes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
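
For readers following the thread, here is a minimal sketch of the approach described above: one explicit lock held across both drain loops, so a message is polled and handed on as a single step. It is not the code from the commit. The class name OrderedEventFlusher, the String payloads, and the two Consumer sinks (standing in for nextFilter.filterWrite and nextFilter.messageReceived) are all hypothetical, and the isHeldByCurrentThread() guard is only an assumed counterpart to the original Thread.holdsLock(this) check.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

/** Hypothetical stand-in for the event flushing in SSLHandler; not MINA code. */
final class OrderedEventFlusher {
    private final Queue<String> filterWriteEventQueue = new ConcurrentLinkedQueue<>();
    private final Queue<String> messageReceivedEventQueue = new ConcurrentLinkedQueue<>();

    // Explicit lock used in place of synchronized (this).
    private final ReentrantLock flushLock = new ReentrantLock();

    // Assumed stand-ins for nextFilter.filterWrite(...) and nextFilter.messageReceived(...).
    private final Consumer<String> writeSink;
    private final Consumer<String> receiveSink;

    OrderedEventFlusher(Consumer<String> writeSink, Consumer<String> receiveSink) {
        this.writeSink = writeSink;
        this.receiveSink = receiveSink;
    }

    void scheduleFilterWrite(String data) {
        filterWriteEventQueue.add(data);
    }

    void scheduleMessageReceived(String data) {
        messageReceivedEventQueue.add(data);
    }

    void flushScheduledEvents() {
        // Assumed equivalent of the Thread.holdsLock(this) guard: a nested call
        // from inside a sink returns at once and leaves draining to the outer call.
        if (flushLock.isHeldByCurrentThread()) {
            return;
        }
        flushLock.lock();
        try {
            String e;
            while ((e = filterWriteEventQueue.poll()) != null) {
                writeSink.accept(e);
            }
            // Unlike the original method, this loop also runs under the lock, so
            // no other thread can deliver a later message between poll() and accept().
            while ((e = messageReceivedEventQueue.poll()) != null) {
                receiveSink.accept(e);
            }
        } finally {
            flushLock.unlock();
        }
    }
}
```

With this shape, a second thread calling flushScheduledEvents() blocks until the first has finished handing over everything it polled, which is the ordering property the reporter's patch was after. The cost is that the downstream filters run while the lock is held, so a slow filter delays every other flushing thread.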