Re: [mongrel2] Need save memory sending big responses from handlers

Justin Karneges Mon, 08 Oct 2012 11:36:42 -0700

On Monday, October 08, 2012 11:15:57 AM Jason Miller wrote:
> On 17:45 Fri 05 Oct, Justin Karneges wrote:
> > On Friday, October 05, 2012 04:56:17 PM Jason Miller wrote:
> > > The only thing that is hard is streaming a large amount of data that
> > > isn't a file.
> > 
> > I have this problem as I'm developing an HTTP proxy. I want to be able to
> > send data back to the client as each packet of an HTTP response is
> > received, as opposed to buffering an entire HTTP response in the proxy
> > before sending it to the client. So, this is not a file source, but I do
> > need some kind of flow control. Fixed rate is an unfortunate solution.
> 
> Fair enough
> 
> > > For example, the extended reply system is flexible enough that we could
> > > use it to establish a plain-old TCP (or unix domain) connection between
> > > m2 and the handler to send the data over on a 1 connection per client ID
> > > basis.  The problem with zeromq is that it buffers arbitrarily large
> > > amounts of data under-the-hood.  Streaming might make more sense over
> > > TCP.
> > 
> > Yes, 1 to 1 TCP connections for streaming could work. It's probably the
> > most straightforward, but it does mean lots of extra fds being used, and
> > it may require some socket judo to ensure you don't wind up with a
> > thread-per-connection.
> > 
> > The other idea is credits-based flow control, which I mentioned in an
> > earlier email. Here's how I would do it:
> > 
> > Have M2 offer a third socket, ROUTER type, intended for delivering messages
> > directly to known handlers. Basically it will be used to send credits,
> > although it could have more uses in the future. Handlers that need to
> > stream responses should connect to this socket so that they may receive
> > the credits messages. Handlers should set ZMQ_IDENTITY to some value
> > (even a random value generated by the handler is fine, it just needs to
> > be set to /something/ so that the handler becomes referenceable).
> > 
> > When M2 pushes a request to an arbitrary handler, an initial number of
> > credits in bytes, e.g. 200000, is provided as an integer value in the
> > request. When a handler pubs its first response message for this request
> > id, it should include the identity of the socket it used to connect to
> > M2's ROUTER socket. M2 will then associate the handler's socket id to the
> > request.
> > 
> > Whenever M2 successfully writes data to a client connection, it sends a
> > message over the ROUTER socket directly to the handler associated with the
> > request containing an integer value of credits equal to the number of
> > bytes written.
> > 
> > The handler's job is to ensure it does not send more data to M2 than is
> > allowed by the credits. M2 itself doesn't actually have to enforce credits
> > nor even record credits. It can trust handlers to not be evil.
> > 
> > This approach would allow keeping the number of sockets and workers fixed.
> > 
> > Note: The third socket also opens the door for non-file-based streaming of
> > large inbound requests. This is another feature I want, too. ;) Some
> > handler could ack the initial request message and include its identity,
> > and then M2 could stream the rest of the body using the ROUTER socket to
> > send specifically to the handler that claimed responsibility of reading
> > the request.
> > 
> > Justin
> 
> This is similar to an idea I had that would be simpler to implement
> with the current architecture: have a 3rd socket that is
> PUB/SUB (with mongrel2 being the PUB side and handlers being the SUB
> side), and whenever a connection gets within a certain amount of the
> limit it publishes a flow-control message.  It's basically the same
> idea, but bang/bang rather than value based, and it just publishes them
> out there for handlers to care about.  If the handler ignores it, then
> mongrel2 will close the connection the same way it does now.  Without
> some sort of behavior of mongrel2 closing highly buffered connections,
> it becomes too easy to accidentally write a handler that is vulnerable
> to DoS attacks, which is something I want to avoid.
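
For reference, the handler-side bookkeeping for the credits scheme quoted
above could look roughly like the sketch below. It is only an illustration:
the class name CreditedStream and the send callable are hypothetical, no
zeromq calls are made, and nothing like this exists in mongrel2 today. The
idea is simply that the handler never sends more bytes than it has credits
for, and resumes when a credits message arrives.

```python
from collections import deque


class CreditedStream:
    """Handler-side credit accounting for one request's response stream.

    Hypothetical helper: 'send' stands in for whatever publishes bytes to M2.
    """

    def __init__(self, initial_credits, send):
        self.credits = initial_credits  # bytes we may still send to M2
        self.pending = deque()          # data waiting for credits
        self.send = send

    def write(self, data):
        """Queue data and send as much of it as the credits allow."""
        self.pending.append(data)
        self._flush()

    def grant(self, n):
        """Call when a credits message for this request arrives."""
        self.credits += n
        self._flush()

    def _flush(self):
        while self.pending and self.credits > 0:
            chunk = self.pending.popleft()
            if len(chunk) > self.credits:
                # Send only what fits; keep the rest at the front of the queue.
                self.pending.appendleft(chunk[self.credits:])
                chunk = chunk[:self.credits]
            self.credits -= len(chunk)
            self.send(chunk)


# Usage: 'sent' stands in for the data actually handed to M2.
sent = []
stream = CreditedStream(initial_credits=10, send=sent.append)
stream.write(b"x" * 25)  # only the first 10 bytes go out
stream.grant(10)         # M2 reports 10 bytes written, so 10 more go out
```

The remaining 5 bytes stay queued in the handler until another credits
message arrives, so the amount in flight never exceeds what M2 has granted.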


Yep, PUB as a third socket would work, too. The main point is just to have a 
way to speak to the same handler more than once. This would also work for my 
off-topic suggestion of streaming inbound requests (mongrel2 could pub a series 
of messages).
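
A rough sketch of what the handler side of the bang/bang variant might
track, assuming the flow-control messages arrive on the SUB socket and are
handed to the handler as bytes. The "PAUSE <id>" / "RESUME <id>" format and
the BangBangGate name are made up for illustration; no wire format has been
proposed in this thread.

```python
class BangBangGate:
    """Tracks which connection IDs the handler should stop sending to."""

    def __init__(self):
        self.paused = set()

    def on_flow_message(self, msg):
        # Hypothetical wire format: b"PAUSE <conn_id>" or b"RESUME <conn_id>".
        verb, conn_id = msg.split(b" ", 1)
        if verb == b"PAUSE":
            self.paused.add(conn_id)
        elif verb == b"RESUME":
            self.paused.discard(conn_id)

    def may_send(self, conn_id):
        """True unless the connection is currently paused."""
        return conn_id not in self.paused


gate = BangBangGate()
gate.on_flow_message(b"PAUSE 42")
blocked = gate.may_send(b"42")    # False while connection 42 is paused
gate.on_flow_message(b"RESUME 42")
resumed = gate.may_send(b"42")    # True again after the resume message
```

A handler that ignores these messages would still get its connection closed
by mongrel2 once the buffer limit is hit, as described above.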

Justin
