Re: [mongrel2] Need save memory sending big responses from handlers

Justin Karneges Fri, 05 Oct 2012 17:46:29 -0700

On Friday, October 05, 2012 04:56:17 PM Jason Miller wrote:
> The only thing that is hard is streaming a large amount of data that
> isn't a file.


I have this problem as I'm developing an HTTP proxy. I want to be able to send 
data back to the client as each packet of an HTTP response is received, as 
opposed to buffering an entire HTTP response in the proxy before sending it to 
the client. So, this is not a file source, but I do need some kind of flow 
control. Sending at a fixed rate is an unfortunate solution.

> What are the use cases for that?  I know there are some, but there might
> be few enough (or the majority might be similar enough) that a targeted
> solution would make more sense than a general purpose "This will let you
> stream anything without using too much RAM or loading down the server
> too much" which is a non-trivial problem to solve.
> 
> For example, the extended reply system is flexible enough that we could
> use it to establish a plain-old TCP (or unix domain) connection between
> m2 and the handler to send the data over on a 1 connection per client ID
> basis.  The problem with zeromq is that it buffers arbitrarily large
> amounts of data under-the-hood.  Streaming might make more sense over
> TCP.

Yes, 1 to 1 TCP connections for streaming could work. It's probably the most 
straightforward, but it does mean lots of extra fds being used, and it may 
require some socket judo to ensure you don't wind up with a thread-per-
connection.
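
To illustrate what I mean by avoiding thread-per-connection, here is a rough 
sketch in Python that multiplexes many streaming connections in one thread 
using the standard selectors module. The pump() function and its arguments are 
made up for illustration and are not part of M2. It also cheats by using a 
blocking send, which a real implementation would have to replace with 
write-readiness handling.

```python
import selectors
import socket


def pump(pairs, chunk_size=4096):
    """Copy bytes from each source socket to its paired sink, in one thread.

    `pairs` is a list of (source, sink) sockets. Hypothetical helper, not M2 API.
    """
    sel = selectors.DefaultSelector()
    for src, dst in pairs:
        src.setblocking(False)
        # Remember the sink as the registration's attached data.
        sel.register(src, selectors.EVENT_READ, dst)

    open_sources = len(pairs)
    while open_sources:
        for key, _events in sel.select():
            data = key.fileobj.recv(chunk_size)
            if data:
                # Simplification: a blocking send stalls every other stream
                # while one slow client drains.
                key.data.sendall(data)
            else:
                # Empty read means the source closed its end.
                sel.unregister(key.fileobj)
                open_sources -= 1
    sel.close()
```

One selector covers every connection, so the thread count stays fixed, but the 
fd count still grows with the number of streaming clients.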

The other idea is credits-based flow control, which I mentioned in an earlier 
email. Here's how I would do it:

Have M2 offer a third socket, ROUTER type, intended for delivering messages 
directly to known handlers. Basically it will be used to send credits, 
although it could have more uses in the future. Handlers that need to stream 
responses should connect to this socket so that they may receive the credits 
messages. Handlers should set ZMQ_IDENTITY to some value (even a random value 
generated by the handler is fine, it just needs to be set to /something/ so 
that the handler becomes referenceable).

When M2 pushes a request to an arbitrary handler, an initial number of credits 
in bytes, e.g. 200000, is provided as an integer value in the request. When a 
handler publishes its first response message for this request id, it should include 
the identity of the socket it used to connect to M2's ROUTER socket. M2 will 
then associate the handler's socket id to the request.

Whenever M2 successfully writes data to a client connection, it sends a 
message over the ROUTER socket directly to the handler associated with the 
request containing an integer value of credits equal to the number of bytes 
written.

The handler's job is to ensure it does not send more data to M2 than is 
allowed by the credits. M2 itself doesn't actually have to enforce credits nor 
even record credits. It can trust handlers to not be evil.
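
As a rough sketch of the handler side of this, assuming credits are plain byte 
counts as described above: the CreditWindow class, the stream() function, and 
the send / wait_for_credits callbacks are all hypothetical names for 
illustration. In a real handler, send would publish a response message to M2 
and wait_for_credits would block on the proposed ROUTER connection.

```python
class CreditWindow:
    """Tracks how many bytes a handler may still send for one request."""

    def __init__(self, initial_credits):
        # Initial credits would come from the request message (e.g. 200000).
        self.credits = initial_credits

    def grant(self, n):
        """Add credits reported by M2 after it wrote n bytes to the client."""
        self.credits += n

    def take(self, wanted):
        """Reserve up to `wanted` bytes; returns how many may be sent now."""
        allowed = min(wanted, self.credits)
        self.credits -= allowed
        return allowed


def stream(data, window, send, wait_for_credits, chunk_size=65536):
    """Send `data` in chunks, never exceeding the available credits."""
    offset = 0
    while offset < len(data):
        n = window.take(min(chunk_size, len(data) - offset))
        if n == 0:
            # Out of credits: block until M2 reports more bytes written.
            window.grant(wait_for_credits())
            continue
        send(data[offset:offset + n])
        offset += n
```

The handler never has more unacknowledged bytes outstanding than the initial 
credit value, which is what bounds the memory used on the M2 side.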

This approach would allow keeping the number of sockets and workers fixed.

Note: The third socket also opens the door for non-file-based streaming of 
large inbound requests. This is another feature I want, too. ;) Some handler 
could ack the initial request message and include its identity, and then M2 
could stream the rest of the body using the ROUTER socket to send specifically 
to the handler that claimed responsibility for reading the request.

Justin
