Here's a new version of the design document that incorporates your suggestions. I hope this is better...
Jon

Gzip Content-Encoding in Squid Design

Version Choice

The goal will be to get these changes into Squid3 HEAD.

Content-Encoding Protocol

Because current browser implementations treat Content-Encoding much as though it were Transfer-Encoding, we will implement Content-Encoding and Accept-Encoding as though they were actually the Transfer-Encoding and TE described in the HTTP specifications. ETags of replies encoded by Squid will be converted to weak tags if they are not already weak. There will be a configuration option to turn off content-encoding.

Content-Encoding Implementation

New HttpHdrContCode module that parses the related HTTP headers and arranges for encoding or decoding as appropriate. It includes the following functions:

* codeParseRequest(): Called from client_side:parseHttpRequest() after the clientStreamInit() call. Checks for and parses Accept-Encoding headers. Instantiates content_coding appropriately, and calls codeClientStreamInit().
* codeClientStreamInit(): Adds a new node to clientStream with codeStreamRead(), codeStreamCallback(), and codeStreamStatus() functions.
* codeStreamCallback(): Sets up encoding/decoding state depending on the combination of Content-Encoding and Accept-Encoding fields seen.
* codeStreamRead(): Calls the HttpContentCoder transformation functions as appropriate.
* codeStreamStatus(): Reports status to the stream.
* codeDupNode(): Allocates a new store_entry and inserts a new clientStream dup node (see below) to (v?)copy data to the store_entry as well as the reply.

New HttpContentCoder abstract type, with functions:

* encodeStart()
* encodeEnd()
* encodeChunk()
* decodeStart()
* decodeEnd()
* decodeChunk()

New per-coded-object ContentCoderState, to handle coding state. It will be referenced from the clientStream, and include the fields:

* HttpContentCoder *coder
* off_t codedOffset

Objects will be stored both in unencoded and encoded formats.
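
As a rough illustration of the HttpContentCoder abstract type and ContentCoderState described above, here is a minimal C++ sketch. The method signatures, the pass-through IdentityContentCoder, and the coderStateEncode() helper are assumptions made for this example and are not part of the design; in particular, it assumes codedOffset counts the bytes produced in the coded stream so far.

```cpp
#include <cstddef>
#include <string>
#include <sys/types.h>  // off_t

// Abstract coder: one subclass per content-coding (e.g. a GzipContentCoder).
// The parameter lists are assumed; the design only names the functions.
class HttpContentCoder {
public:
    virtual ~HttpContentCoder() {}
    virtual void encodeStart(std::string &out) = 0;
    virtual void encodeChunk(const char *in, size_t len, std::string &out) = 0;
    virtual void encodeEnd(std::string &out) = 0;
    virtual void decodeStart(std::string &out) = 0;
    virtual void decodeChunk(const char *in, size_t len, std::string &out) = 0;
    virtual void decodeEnd(std::string &out) = 0;
};

// Hypothetical pass-through coder, used only to exercise the interface.
class IdentityContentCoder : public HttpContentCoder {
public:
    virtual void encodeStart(std::string &) {}
    virtual void encodeChunk(const char *in, size_t len, std::string &out) {
        out.append(in, len);  // no transformation: copy input to output
    }
    virtual void encodeEnd(std::string &) {}
    virtual void decodeStart(std::string &) {}
    virtual void decodeChunk(const char *in, size_t len, std::string &out) {
        out.append(in, len);
    }
    virtual void decodeEnd(std::string &) {}
};

// Per-coded-object state, with the two fields listed in the design.
struct ContentCoderState {
    HttpContentCoder *coder;
    off_t codedOffset;  // assumed meaning: bytes of coded output produced so far
};

// Hypothetical helper: encode one chunk and advance codedOffset by the
// number of coded bytes the coder appended to `out`.
inline void coderStateEncode(ContentCoderState &state, const char *in,
                             size_t len, std::string &out) {
    const size_t before = out.size();
    state.coder->encodeChunk(in, len, out);
    state.codedOffset += static_cast<off_t>(out.size() - before);
}
```

A stream node would hold one ContentCoderState per reply and call the helper for each buffer it reads, so the coder itself stays free of per-reply bookkeeping.
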
An object will stay in the format in which Squid receives it until a client requests it in a different Content-Encoding that Squid supports (this could be immediate). Once this happens, the object will be streamed, coded, into a different StoreEntry and on to the client.

A new store_dup module will be created to manage dup store_entries and make sure duplicate entries are invalidated when a new version of an object is read. It consists of a circular list of StoreEntry pointers named "dupnext" and "dupprev". When a new duplicate encoding (or decoding) of an object is created, it's added to the list. When any StoreEntry is invalidated or updated, all dups are invalidated. Functions:

* storeNewDup(): Called from codeDupNode(), above; creates a new node with the dup'ed node attached via the dup list.
* storeDupClientStreamInit(): Called from codeDupNode(); adds a new clientStreamNode to copy off encoded data to the new node as well as the reply.
* storeDupClientStreamRead(): Does the copying off.
* storeDupClientStreamCallback(): Null function.
* storeDupClientStreamStatus(): Returns status.

Other changes needed:

* Add new content_coding field to HttpReply.
* New httpHeaderGetContentEncoding(HttpReply *) function in HttpHeader.cc.
* HttpReply:httpReplySetHeaders will weaken the etag if appropriate.
* A new configuration flag to turn content-encoding off, if desired.

Gzip

A new GzipContentCoder module, which will be an instance of HttpContentCoder. Data encoding will be handled by the gzip.org zlib library. Functions:

* gzEncodeStart: call deflateInit2(), write header
* gzEncodeEnd: write trailer
* gzEncodeChunk: call deflate()
* gzDecodeStart: call inflateInit2(), read and verify header
* gzDecodeEnd: verify trailer
* gzDecodeChunk: call inflate()
* gzDoSaveEncoded(): true

Test Strategy

Must pass the test suite. Must add appropriate tests, including successfully sending gzipped content to oneself. Will also test against the Apache mod_gzip implementation, and maybe even gunzip.
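
To make the store_dup list described earlier more concrete, here is a minimal C++ sketch of a circular dupnext/dupprev ring in which invalidating any one entry invalidates every dup. The StoreEntry below is a stripped-down stand-in, not Squid's real structure, and the function names storeDupLink(), storeDupInvalidateAll(), and storeDupCount() are hypothetical; only the dupnext and dupprev pointer names come from the design.

```cpp
#include <cstddef>

// Stand-in for Squid's StoreEntry with only the fields this sketch needs.
// A fresh entry forms a ring of one: it is its own next and previous dup.
struct StoreEntry {
    StoreEntry *dupnext;
    StoreEntry *dupprev;
    bool valid;  // assumed flag; the real invalidation mechanism may differ
    StoreEntry() : dupnext(this), dupprev(this), valid(true) {}
};

// Link a newly created dup into the ring right after the original entry.
// Assumes `dup` is not yet a member of any other ring.
inline void storeDupLink(StoreEntry *orig, StoreEntry *dup) {
    dup->dupnext = orig->dupnext;
    dup->dupprev = orig;
    orig->dupnext->dupprev = dup;
    orig->dupnext = dup;
}

// Invalidate the given entry and every other encoding of the same object.
inline void storeDupInvalidateAll(StoreEntry *entry) {
    StoreEntry *e = entry;
    do {
        e->valid = false;
        e = e->dupnext;
    } while (e != entry);
}

// Count the entries on the ring, including `entry` itself.
inline size_t storeDupCount(const StoreEntry *entry) {
    size_t n = 0;
    const StoreEntry *e = entry;
    do {
        ++n;
        e = e->dupnext;
    } while (e != entry);
    return n;
}
```

Because the list is circular, invalidation can start from whichever entry was updated, encoded or not, without needing to know which one is the "original".
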