https://bz.apache.org/bugzilla/show_bug.cgi?id=57943

--- Comment #5 from Christopher Schultz <ch...@christopherschultz.net> ---
Comment on attachment 32880
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=32880
analysis for bug57943

Ian's comment, not as an attachment, and therefore much more readable, is:

I am the colleague of Sun Qi who reported
[bug57943](https://bz.apache.org/bugzilla/show_bug.cgi?id=57943) one month ago.
I believe we managed to find out why ConcurrentModificationException happens in
our scenario.

Here's the scenario, our application uses async servlet to write the response
back to the client. The operation we put in async servlet which run in another
thread is quite time-consuming. When the incoming traffic is heavy, which
usually happens when the instances of tomcat server reboots, we hit this issue
very easily. 

When the ClientPoller thread run into timeout method (in NioEndpoint) and
execute the following code:

```java
} else if (!ka.isAsync() || ka.getTimeout() > 0) {
    // Async requests with a timeout of 0 or less never timeout
    long delta = now - ka.getLastAccess();
    long timeout =
(ka.getTimeout()==-1)?((long)socketProperties.getSoTimeout()):(ka.getTimeout());
    boolean isTimedout = delta > timeout;
    if (isTimedout) {
        // Prevent subsequent timeouts if the timeout event takes a while to
process
        ka.access(Long.MAX_VALUE);
        processSocket(ka.getChannel(), SocketStatus.TIMEOUT, true, 6);
    }
}
```

Eventually it calls `processSocket(ka.getChannel())` and puts back the
NioChannel instance to the cache `nioChannels` for the late re-use:

```java
if (ka!=null) ka.setComet(false);
socket.getPoller().cancelledKey(key, SocketStatus.ERROR, false);
if (running && !paused) {
    nioChannels.offer(socket);
}
```

Unfortunately in our scenarios, when QPS is greater than 1000req/sec, async
servlet threads have very large chance to start async-complete and begin to
call `processSocket` (in Http11NioProcessor) at the same time:

```java
} else if (actionCode == ActionCode.ASYNC_COMPLETE) {
    if (asyncStateMachine.asyncComplete()) {
    ((NioEndpoint)endpoint).processSocket(socketWrapper.getSocket(),
SocketStatus.OPEN_READ, true, 10);
}
```

This leads to cache pollution in `nioChannels` since there're chances to offer
the same object into the cache multiple times. When the Acceptor thread polls
the object and hands it over to the ClientPollers, the same object could be
pass into the different ClientPoller threads, then leads to very serious
problem.

You may notice the code I show here is the code from tomcat 7.0.54, and claim
the similar issue has already been addressed, but this is not true. We noticed
the similar issue was reported on
[Bug57340](https://bz.apache.org/bugzilla/show_bug.cgi?id=57340), and the
solution has already been checked into the later release.

But this solution is not a complete solution, in our scenario, there's chance
to offer the duplicated object to the cache in `else if` clause where the
solution for Bug57340 doesn't cover:

```java
} else if (handshake == -1 ) {
    if (key != null) {
        socket.getPoller().cancelledKey(key, SocketStatus.DISCONNECT, false);
    }
    nioChannels.offer(socket);
    socket = null;
    if ( ka!=null ) keyCache.offer(ka);
    ka = null;
}
```

This is easily to understand. When handshake equals -1, one possibility is the
`key` passed into the doRun method is `null`, and when `key` is null, it means
some other thread has already finished processing on the same socket:

```java
public void run() {
    SelectionKey key = socket.getIOChannel().keyFor(
        socket.getPoller().getSelector());
```

We propose the change below in order to address this issue throughly. In this
case, I think we can simply drop the object instead of offering it since it
looks there's no other ideal way to not use lock.

```java
} else if (handshake == -1 ) {
    if (key != null) {
        if (socket.getPoller().cancelledKey(key, SocketStatus.DISCONNECT,
false) != null) {
            nioChannels.offer(socket);
            if (ka != null) keyCache.offer(ka);
        }
    }
    socket = null;
    ka = null;
```

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to