On Monday, October 08, 2012 11:15:57 AM Jason Miller wrote:
> On 17:45 Fri 05 Oct , Justin Karneges wrote:
> > On Friday, October 05, 2012 04:56:17 PM Jason Miller wrote:
> > > The only thing that is hard is streaming a large amount of data that
> > > isn't a file.
> >
> > I have this problem as I'm developing an HTTP proxy. I want to be able to
> > send data back to the client as each packet of an HTTP response is
> > received, as opposed to buffering an entire HTTP response in the proxy
> > before sending it to the client. So, this is not a file source, but I do
> > need some kind of flow control. Fixed rate is an unfortunate solution.
>
> Fair enough
>
> > > For example, the extended reply system is flexible enough that we could
> > > use it to establish a plain-old TCP (or unix domain) connection between
> > > m2 and the handler to send the data over on a 1 connection per client ID
> > > basis. The problem with zeromq is that it buffers arbitrarily large
> > > amounts of data under-the-hood. Streaming might make more sense over
> > > TCP.
> >
> > Yes, 1 to 1 TCP connections for streaming could work. It's probably the
> > most straightforward, but it does mean lots of extra fds being used, and
> > it may require some socket judo to ensure you don't wind up with a
> > thread-per-connection.
> >
> > The other idea is credits-based flow control, which I mentioned in an
> > earlier email. Here's how I would do it:
> >
> > Have M2 offer a third socket, ROUTER type, intended for delivering
> > messages directly to known handlers. Basically it will be used to send
> > credits, although it could have more uses in the future. Handlers that
> > need to stream responses should connect to this socket so that they may
> > receive the credits messages. Handlers should set ZMQ_IDENTITY to some
> > value (even a random value generated by the handler is fine, it just
> > needs to be set to /something/ so that the handler becomes referenceable).
> >
> > When M2 pushes a request to an arbitrary handler, an initial number of
> > credits in bytes, e.g. 200000, is provided as an integer value in the
> > request. When a handler pubs its first response message for this request
> > id, it should include the identity of the socket it used to connect to
> > M2's ROUTER socket. M2 will then associate the handler's socket id with
> > the request.
> >
> > Whenever M2 successfully writes data to a client connection, it sends a
> > message over the ROUTER socket directly to the handler associated with the
> > request containing an integer value of credits equal to the number of
> > bytes written.
> >
> > The handler's job is to ensure it does not send more data to M2 than is
> > allowed by the credits. M2 itself doesn't actually have to enforce credits
> > nor even record credits. It can trust handlers to not be evil.
> >
> > This approach would allow keeping the number of sockets and workers fixed.
> >
> > Note: The third socket also opens the door for non-file-based streaming of
> > large inbound requests. This is another feature I want, too. ;) Some
> > handler could ack the initial request message and include its identity,
> > and then M2 could stream the rest of the body using the ROUTER socket to
> > send specifically to the handler that claimed responsibility for reading
> > the request.
> >
> > Justin
>
> This is similar to an idea I had that would be simpler to implement
> with the current architecture: have a 3rd socket that is PUB/SUB
> (with mongrel2 being the PUB side and handlers being the SUB side),
> and whenever a connection gets within a certain amount of the limit,
> mongrel2 publishes a flow-control message. It's basically the same
> idea, but bang/bang rather than value based, and it just publishes them
> out there for handlers to care about. If the handler ignores it, then
> mongrel2 will close the connection the same way it does now.
> Without some mechanism for mongrel2 to close heavily buffered
> connections, it becomes too easy to accidentally write a handler that
> is vulnerable to DoS attacks, which is something I want to avoid.

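The bang/bang variant quoted above could be sketched roughly as follows. This is only an illustration in plain Python, not mongrel2 code: the class name, the "pause"/"resume"/"close" message names, the `margin` parameter, and the resume threshold are all assumptions made for the example (the quoted text only describes publishing a flow-control message near the limit and closing the connection if the handler ignores it).

```python
class BangBangFlowControl:
    """Server-side view of one client connection (hypothetical helper).

    Tracks how many bytes are buffered for the client and decides which
    flow-control message, if any, would be published to handlers.
    """

    def __init__(self, limit, margin):
        self.limit = limit            # hard cap; exceeding it closes the connection
        self.high = limit - margin    # "within a certain amount of the limit"
        self.low = self.high // 2     # assumed resume point, not from the thread
        self.buffered = 0
        self.paused = False
        self.closed = False

    def on_buffered(self, n):
        """A handler queued n more bytes. Returns a message name or None."""
        if self.closed:
            return None
        self.buffered += n
        if self.buffered > self.limit:
            # The handler ignored the warning: close, as mongrel2 does today.
            self.closed = True
            return "close"
        if not self.paused and self.buffered >= self.high:
            self.paused = True
            return "pause"
        return None

    def on_written(self, n):
        """n bytes were written to the client. Returns a message name or None."""
        if self.closed:
            return None
        self.buffered = max(0, self.buffered - n)
        if self.paused and self.buffered <= self.low:
            self.paused = False
            return "resume"
        return None
```

A handler that honors "pause" never reaches the hard limit, while one that ignores it is cut off, which keeps the DoS protection mentioned above.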
Yep, PUB as a third socket would work, too. The main point is just to have a
way to speak to the same handler more than once. This would also work for my
off-topic suggestion of streaming inbound requests (mongrel2 could pub a
series of messages).

Justin
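
For reference, the handler-side bookkeeping for the credits scheme quoted earlier in this message might look something like the sketch below. It is plain Python with no ZeroMQ calls; `CreditedStream` and its method names are hypothetical, and how credit messages actually arrive (ROUTER or PUB) is left out. The only behavior taken from the thread is that the handler starts with an initial byte credit, never sends more than its credits allow, and gains credits equal to the bytes M2 reports as written.

```python
class CreditedStream:
    """Handler-side credit accounting for one request id (hypothetical helper)."""

    def __init__(self, initial_credits):
        self.credits = initial_credits   # bytes we may still send to M2
        self.pending = bytearray()       # response data waiting for credits

    def queue(self, data):
        """Add response data that should eventually be sent to M2."""
        self.pending.extend(data)

    def grant(self, n):
        """M2 reported that n bytes were written to the client."""
        self.credits += n

    def next_chunk(self):
        """Return as many pending bytes as the credits allow (may be empty)."""
        n = min(self.credits, len(self.pending))
        chunk = bytes(self.pending[:n])
        del self.pending[:n]
        self.credits -= n
        return chunk


# Usage: the handler sends next_chunk() whenever data or credits arrive.
stream = CreditedStream(initial_credits=5)
stream.queue(b"hello world")
first = stream.next_chunk()    # limited to the 5 initial credits
stream.grant(3)                # M2 wrote 3 bytes to the client
second = stream.next_chunk()   # 3 more bytes may now be sent
```

Since the handler does all the accounting, M2 only has to report bytes written, which matches the point that M2 need not enforce or even record credits.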
