Hi Simone, thanks for the deep dive into this!

> On 3 Apr 2016, at 18:43, Simone Bordet <simone.bor...@gmail.com> wrote:
> 
> Throwing exceptions.
> ---
> Given that the API is making use of CompletableFuture, throwing
> exceptions instead is in my experience not the right thing to do.
> You want to communicate failure via the CompletableFuture rather than
> throwing exceptions.
> For example, the WebSocket.sendXXX() methods throw a variety of
> exceptions, mostly IllegalStateExceptions, for reasons that may not be
> dependent on the application using the API (the connection is closed,
> an error occurred, the permits are exhausted, etc.).
> Throwing exceptions will make the API cumbersome to use because
> applications have to do:
> 
> try {
>    CF cf = websocket.sendText(...);
> } catch (Exception x) {
>    // What here ?
> }
> 
> When using async API, you don't want to catch exceptions anymore,
> since the (only) mechanism of conveying failure is the async API (via
> CFs or callbacks, or whatever other means is there in the API).
> IMHO the implementation should be rewritten to never throw exceptions,
> and always notify failures via CFs.

I see your point. Yes, working asynchronously does suggest that we should relay
all exceptions through the asynchronous interface (a CF in this particular
case), simply because otherwise we would have to handle exceptions in many
places (as you have illustrated).

But there's a force working in the opposite direction. The further the thrown
exception is from its origin, the more of its context we lose.

In addition to this, some exceptions clearly signal programming errors that
could easily be detected on the spot. E.g. passing null, or a negative value
where a positive value or 0 is expected. Or a state problem, e.g. trying to
send more data after a Close message has already been sent. The same goes for
permit exhaustion (let's for now forget you're looking into the implementation
rather than the API). In the API only 1 incomplete/outstanding/queued write is
permitted at any given time. It's the application's fault if it doesn't keep
an eye on whether or not the previous send has completed. I can hardly
imagine what could be done in the catch block above in this case.
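To make that split concrete, here is a minimal sketch (all names hypothetical,
not the actual API or implementation) of the two failure styles: programmatic
errors thrown on the spot, operational failures relayed through the CF:

```java
import java.util.concurrent.CompletableFuture;

public class SendStyles {
    // Hypothetical sketch: caller bugs fail fast with exceptions,
    // operational failures travel through the returned CF.
    static CompletableFuture<Void> sendText(CharSequence message, boolean previousSendDone) {
        if (message == null)
            throw new NullPointerException("message");       // caller bug: detectable on the spot
        if (!previousSendDone)
            throw new IllegalStateException("send pending"); // caller bug: the single permit is exhausted
        CompletableFuture<Void> cf = new CompletableFuture<>();
        try {
            // ... the asynchronous write would start here ...
            cf.complete(null);
        } catch (RuntimeException x) {
            cf.completeExceptionally(x);                     // operational failure: via the CF
        }
        return cf;
    }

    public static void main(String[] args) {
        System.out.println(sendText("hello", true).isDone());
        try {
            sendText("again", false);
        } catch (IllegalStateException expected) {
            System.out.println("caught: " + expected.getMessage());
        }
    }
}
```

In this style, the catch block from your example would only ever see bugs that
the application should have prevented, never I/O failures.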

BTW, whether or not a single outstanding write is appropriate here is a
completely different question.

> Threading.
> ---
> WebSocket.sendXXX() calls
> MessagePublisher.send(), which dispatches to
> MessagePublisher.react(), which calls
> MessageSender.onNext(), which dispatches to
> MessageSender.react(), which calls
> MessageSender.send(), which calls
> Writer.onNext(), which dispatches to
> Writer.write(), which finally writes the bytes to the network.
> 
> There are 3 (!) dispatches to different threads for a WebSocket send().
> This is an enormous waste of resources which is completely unnecessary.
> 
> While I understand the goal of this implementation is not to be the
> fastest ever written, dispatching every WebSocket send() to 3 threads
> is not what I would expect from the JDK.

All these are stages of a "pipeline", each of which is independent to a
degree. I've tried to break unnecessary temporal coupling between them.

> Send of WebSocket messages should be done in the caller thread, with
> *zero* dispatches in between.

What's the reason behind this? What about being asynchronous, non-blocking and
responsive? Also, please remember that since we use an ExecutorService, a task
does not necessarily correspond to a separate thread. In fact, all of this
could run with a same-thread executor (try it!):

    @Override
    public void execute(Runnable command) {
        command.run();
    }
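For instance, with such a direct executor (a minimal self-contained sketch)
every "dispatch" collapses into a plain method call on the caller's thread:

```java
import java.util.concurrent.Executor;

public class DirectExecutorDemo {
    public static void main(String[] args) {
        Executor direct = Runnable::run; // executes the command on the caller's thread
        Thread caller = Thread.currentThread();
        direct.execute(() ->
            // with a same-thread executor a "dispatch" is just a method call
            System.out.println(Thread.currentThread() == caller));
    }
}
```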

I had my reasons for this while implementing. Would you agree it's generally
more desirable to first make something work correctly and then to make it run
faster (the former is not mutually exclusive with the latter)?

By introducing the SignalHandler I separated an invocation from the actual
execution, thus making all the components behave more like Active Objects.
Initially I was somewhat worried about running into hard-to-track deadlocks
and/or unnecessary thread blocking.

> A read event from the network dispatches to
> Reader.react(), which fake dispatches to
> BuffersSubscriber.onNext() which dispatches to
> BuffersSubscriber.react() which calls
> Reader.readFrame() which calls
> BuffersSubscriber.deliver() which fake dispatches to
> WebSocket.Listener (application code).
> 
> There are 2 dispatches to different threads for every WebSocket read
> event. I can live with 1 dispatch, 2 are unnecessary.

(Same arguments as the above)

> I am not sure that Reader is implementing Flow (ReactiveStreams) correctly.
> Reader.react() calls subscriber.onNext() from thread T1.
> The subscriber handles the message and calls request(1), which calls
> handler.signal() which dispatches T2 to Reader.react() so now there
> are 2 threads inside Reader.react(), which is bad.

First of all, java.net.http.Reader#react has the following structure:

    private void react() {
        synchronized (this) {
        ...
        }
    }

Therefore it's not possible for more than a single thread to be inside
Reader.react() at any given time. Secondly, if you look at SignalHandler more
closely, you will see that it doesn't allow the handler action (supplied in
the constructor) to be executed by many threads simultaneously. The
synchronization here is therefore purely for memory visibility purposes.
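The guarantee can be illustrated with a minimal sketch of the idiom (my
reconstruction for illustration, not the actual SignalHandler code): a small
state machine ensures at most one thread runs the action at a time, while
signals that arrive meanwhile coalesce into a single re-run:

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the idiom: at most one thread runs the action at any moment;
// concurrent signals coalesce into one additional re-run.
final class CoalescingHandler {
    private static final int IDLE = 0, RUNNING = 1, RERUN = 2;
    private final AtomicInteger state = new AtomicInteger(IDLE);
    private final Executor executor;
    private final Runnable action;

    CoalescingHandler(Executor executor, Runnable action) {
        this.executor = executor;
        this.action = action;
    }

    void signal() {
        while (true) {
            int s = state.get();
            if (s == IDLE && state.compareAndSet(IDLE, RUNNING)) {
                executor.execute(this::run); // this thread won the right to run
                return;
            }
            if (s != IDLE && state.compareAndSet(s, RERUN))
                return;                      // the running thread will loop once more
        }
    }

    private void run() {
        do {
            state.set(RUNNING);              // consume any pending re-run request
            action.run();
        } while (!state.compareAndSet(RUNNING, IDLE)); // RERUN arrived: go again
    }

    public static void main(String[] args) {
        int[] count = {0};
        CoalescingHandler h = new CoalescingHandler(Runnable::run, () -> count[0]++);
        h.signal();
        h.signal();
        System.out.println(count[0]); // each sequential signal ran the action once
    }
}
```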

> I am also not sure that Reader is handling buffers correctly.
> A buffer is used to read from the network, then "shared" (not sure
> what is the semantic of this "sharing") and a slice passed to the
> application.
> Then the read loop comes over, and I understand that it may find the
> read buffer consumed and dispose() it.
> That disposed buffer can now be used to read more from the network,
> overwriting the data previously read and therefore invalidating the
> slice passed to the application.

Shared<Buffer> represents a Buffer whose contents are sliced without overlaps.
Given that the slices are then published safely to their target threads, they
can be simultaneously read and/or written.

The root buffer (the one that hosts all the slices) counts its children. Each
dispose() decrements the total count and does nothing unless the resulting
count is 0, in which case this Shared<Buffer> (the parent) goes back to the
pool. Simple reference counting, no magic. In theory this should provide
better latency and more concurrency without any need for copying.

Maybe it (dispose) just needs a better name?
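For illustration, here is a minimal sketch of the reference-counting scheme
(names and shapes assumed, not the actual Shared<Buffer> code): the root is
recycled only when the last slice has been disposed:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Sketch: a root buffer handed out as non-overlapping slices;
// the last dispose() returns the root to the pool.
final class SharedBuffer {
    private final ByteBuffer root;
    private final AtomicInteger children = new AtomicInteger();
    private final Consumer<ByteBuffer> pool; // where the root goes when fully released

    SharedBuffer(ByteBuffer root, Consumer<ByteBuffer> pool) {
        this.root = root;
        this.pool = pool;
    }

    ByteBuffer share(int from, int to) {     // hand out a non-overlapping slice
        children.incrementAndGet();
        ByteBuffer dup = root.duplicate();   // the slice views the same memory
        dup.position(from);
        dup.limit(to);
        return dup.slice();
    }

    void dispose() {                         // called once per shared slice
        if (children.decrementAndGet() == 0)
            pool.accept(root);               // last child gone: recycle the root
    }

    public static void main(String[] args) {
        boolean[] recycled = {false};
        SharedBuffer shared = new SharedBuffer(ByteBuffer.allocate(16), b -> recycled[0] = true);
        ByteBuffer a = shared.share(0, 8);
        ByteBuffer b = shared.share(8, 16);
        shared.dispose();
        System.out.println(recycled[0]);     // one slice still outstanding
        shared.dispose();
        System.out.println(recycled[0]);     // both slices disposed, root recycled
    }
}
```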

> I am also dubious about the usage of the CF returned by onXXX()
> methods: if the read buffer is sliced and the slice passed to the
> application, the underlying memory of both the read and sliced buffers
> cannot be touched until the CF is completed.

True for the sliced buffer. Not sure I understand why it should be the case for
the read buffer.

I've just sliced off a part that represents something deliverable, so why
can't I proceed with reading the next chunk of data?

> I think that WebSocket.Listener javadoc need a clarification of the
> exact semantic of the CF returned by onXXX() methods and provide a
> couple of examples, like A) queueing messages without consuming them
> (but still calling request(1)) and B) passing them to other
> asynchronous APIs.
> It also needs a clarification of what happens if an application never
> returns (or returns after a long time) from onXXX() methods.

Agree.

> I kind of stopped here.
> I understand this is a first draft, but IMHO the implementation needs
> a major overhaul to simplify it.

I'm keeping this in mind. Don't worry.

Thanks again!

