G'day again, I've been staring at the comm code in Squid-2 to get an idea of how to implement properly async copy-free network IO. Its tricky to say the least.
I'll stick to the write side for now. So we have comm_write(), which mostly makes copy-free write network IO possible. You pass it a buffer to write which must stay valid for the lifetime of the write. The initial comm_write() API, as far as I can judge, assumed that the caller would allocate a temporary buffer to write and pass in a FREE * func to free the buffer. This way the buffer can be sent even if the callback is cancelled at some point in the lifetime of the write (ie, "fire and mostly forget" writing.) As part of my "wtf performance" run I've gone through the code and tried to remove all the temporary buffer allocations and copies. This has worked - and has sped things up a bit - but it also means that various things could potentially go wrong and garbage may be written because the underlying buffer gets invalidated around the time the callback gets invalidated. This is a bit of a pain in the butt right now as auditing the main place this can happen - src/client_side.c - is a freaking enormous task. It wouldn't be difficult for me to unwind all of that and use a temporary buffer -again-, but to be honest, if it were going to be a problem it would have surfaced by now under heavy load. It hasn't, as far as we can tell, but it doesn't mean it isn't happening in very strange corner cases. Now, the reason I bring this up is in light of copy-free network IO. Again, this requires the underlying buffer to persist until the write is completed or is successfully cancelled. comm_write() uses cbdata and will lazily ignore calling a callback if the pointer gets invalidated, something I don't really like. Instead, I'm going to experiment with a more "normal" approach found elsewhere - the callback stays valid until the write is completed or invalidated; a comm_close() or a call to a mythical comm_cancel_write() will result in the write callback being called with write completion, either cancelled or completed. This will require changes to the object lifecycle - instead of being able to disappear whenever they please, they will have to stay around until their scheduled IO callbacks have completed. Although it may complicate the code slightly, I think it'll make debugging easier as the object -will- have to stay around until its IO callbacks are called somehow. I'm not going to dump any changes into Squid-2 or Squid-3 until I've come up with something which I know works. My general gameplan involves coding up a seperate set of comm_write() and comm_read() like functions - probably io vector ones, with comm_{read,write}() devolving to a single io vector - with stricter defined semantics (ie, written down somewhere!) As an aside: This'll all really become less of a point when a proper reference-counted buffer type is brought in. I'll then be able to do things like write out pages from the memory store and be certain they'll remain stable until they're written, rather than the current situation where I'm writing out references to the pages and simply -hoping- they stay stable. (Now, for those who are scared - _yes_, they stay stable in client_side.c[c] in all of Squid-2.6, 2.7 and 2.HEAD as far as I can tell (and thus they should be in 3.x) - not by API contract, but by sheer luck. ;) Adrian