On 28/03/17 00:30, Robert Lewis wrote:
> Hi,
> 
> I am tracking down a fairly sporadic bug in our software that uses Tomcat
> 8.0.38. Long story short, sometimes calls to Basic.sendBinary() to a full
> buffer then to a small buffer (eg. 8192x3 then 444 bytes). The first 8192
> sends will succeed and occasionally we see the last 444 byte send 'fail' in
> a way that we never see it leave the network, resulting in the client
> waiting for bytes and eventually timing out. We notice that if we close the
> the connection remotely, the bytes immediately get sent.
> 
> This led me to believe something was not getting flushed properly. This URL
> indicates that there were some recent conversations about something similar:
> 
> http://tomcat.10.x6.nabble.com/Tomcat-WebSocket-does-not-
> always-send-asynchronous-messages-td5060965.html
> 
> I decided to dig further and tried to send a ping between sending bytes, it
> seems to alleviate the problem, but still doesn't tell me what is going on.
> Taking a suggestion from Mark T.  \around a 'possible race condition in the
> web socket code', I debugged through tomcat code looking for race
> conditions, and immediately a source file and function (doWrite()) stood
> out, it is modifying state then calling to another public function to act
> on that state:
> 
> https://github.com/apache/tomcat/blob/TOMCAT_8_0_0_RC10/
> java/org/apache/tomcat/websocket/server/WsRemoteEndpointImplServer.java

That is quite an old version of the code. You'd be better off looking at
the master copy which is in svn at the ASF:

https://svn.apache.org/viewvc/tomcat/tc8.5.x/trunk/java/org/apache/tomcat/websocket/server/WsRemoteEndpointImplServer.java?view=log

> Further up, the doWrite() caller in the endpoint was moved out of a sync
> block to prevent a deadlock (there was a specific comment around this),
> which leads me to believe that something was calling doWrite() on multiple
> threads, but I have not tracked that down yet.

This is part of the long story to do with trying to run on the Servlet
3.1 API (see below). For full details see:
http://people.apache.org/~markt/presentations/2013-09-XX-WebSocket-on-Servlet31.pdf

starting around slide 37.

The short version is that it is possible that there are two threads
trying to write in parallel:
- one 'write' thread responding to the poller
- one 'read' thread in application code that is trying to write a
  response to what it has just read

> Anyway, there was a recent code change on the 8.5.x series to the doWrite()
> implementation which checks to see if it is a blocking call, then sends
> immediately to the socket and flushes without class level state. I have not
> tested this yet to see if it solves the issue as we are tied to 8.0.x for
> now, but working on migrating our code to work with 8.5.x.
> 
> Most of the work on the files seem to be done by Mark T. (awesome work, we
> rely on this functionality heavily!)

Flattery always welcome ;)

> so I figured I would reach out and ask
> about the doWrite() change to have a else block for blocking sockets. Is
> this intended to fix the issue I am describing above?

No. The major change between 8.0.x and 8.5.x was giving up on the idea
of building WebSocket solely on top  of the Servlet API. To cut a long
story short, it isn't possible. This is primarily because WebSocket
requires a connection to switch back and forth between blocking and
non-blocking but Servlet 3.1 non-blocking I/O doesn't allow upgraded
connections to do that.

The else block was the result of being able to write simpler code once
we bypassed the Servlet API and went directly to Tomcat's low level I/O API.

> I would check the history but I cannot seem to find the source for the
> initial commit that introduces the else block for 8.5.x.

https://svn.apache.org/viewvc?view=revision&revision=1662698

You might want to explore the test case I wrote for the previous report:
https://svn.apache.org/viewvc/tomcat/trunk/test/org/apache/tomcat/websocket/server/TestAsyncMessages.java?view=log

If you can modify that so it fails at least some of the time - ideally
all the time - fixing it should be fairly simple.

Mark


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org

Reply via email to