On 03/08/2014 04:13 AM, Amos Jeffries wrote: > Prepare the way to efficiently parse client requests using SBuf based > parser-ng. > > We rely on several guarantees for this to work: > > * rawSpace(N) guarantees that the backing store is both unique to this > SBuf and has at least N bytes of unused buffer allocated.
Yes, but the problem here is that this guarantee does not last long enough. See below. > * SBuf member retains an active reference to the MemBlob backing store > for the duration of the read operation. Except in cases whete > ConnStateData gets destructed, in which case ConnStateData cancels the > read explicitly. SBuf does not provide such a guarantee AFAICT, even with the above exception. An SBuf object may change its MemBlob, and, IIRC, its MemBlob may change its backing store. Any one of those two actions would be deadly for the patched code. Retaining a reference is not enough to prevent those actions. > * comm_read() is effectively a blocking operation with regards to read > on the client socket. No other read is scheduled until the callback > handler has been run. Yes, but the patched code actually needs a much stronger guarantee which is not provided by the patched Squid: Comm_read() has to be blocking with regard to any other buffer-modifying operation as well. For example, nobody can consume anything from the buffer while the comm_read is pending because consumption may invalidate the free space pointer that the comm has. Why would somebody want to consume the input buffer while a Comm read is pending? This might happen when, for example, the BodyPipe buffer gets free space and notifies the producer (via (ConnStateData's noteMoreBodySpaceAvailable) that the producer can now shovel more request body data into the pipe. The producer shovels, consumes the shoveled data, and SBuf gets cleared. Comm then reads into an invalid (or worse) pointer. Another example are various c_str() calls that the patched code uses with the input buffer. Some of those calls happen when forming error responses, and some happen when handling completely asynchronous (to the transaction holding the buffer) Cache Manager requests! Those calls are allowed to change the buffer storage pointer, but even when they do not, they modify the contents of something we may have just read. I am worried that this design is too dangerous to use, even if it miraculously works now due to extra copies or other temporary/implicit boundaries that currently isolate buffer-modifying code from the buffer reading code. > * parsing and other manipulations of the buffer are done synchronously > by the read handler before scheduling more reads. AFAICT, parsing and other manipulations may happen while the read is pending. The above are just a couple of examples I could find quickly. Please redesign to avoid these problems. Please let me know if you need help with that -- I have a couple of ideas how this can be done, but I suspect yours are going to be better since you have spent a lot more time with this code. > - comm_read(clientConnection, in.addressToReadInto(), > getAvailableBufferLength(), reader); > + comm_read(clientConnection, in.buf.rawSpace(in.buf.spaceSize()), > in.buf.spaceSize()-1, reader); If this code survives, please allocate the buffer outside the comm_read call. There was already an ugly dependency on the right parameter evaluation order, and your changes make it look even worse. More to come, but probably after we settle the design problem discussed in the beginning of this email. Thank you, Alex.
