Thanks, Barry. We'll discuss this at the next ODP public call.

On Fri, Sep 8, 2017 at 5:06 PM, Barry Spinney <spin...@mellanox.com> wrote:
>
> Here is my promised document on "Lossless Data Compression Requirements and
> Use Cases", in response to Shally's request for compression use cases.  It is
> mostly a complete first draft, with just a couple of "To Be Done" parts
> remaining.  I seem to recall some constraints on the mailing list regarding
> email attachments and rich document formats, so I chose to use a simple text
> file format - despite the fact that it looks a little ugly.
>
> **********************************************************************
>
>
>                           Lossless Data Compression
>                          Requirements and Use Cases
>
>    Sep 8/2017                                                   Barry Spinney
>
>
>     This document describes the Use Cases that would justify a HW based
> compression/decompression engine on Mellanox SoC chips.  An API for such a
> compression engine is currently being considered for addition to the
> OpenDataPlane (ODP) standard.  For these Use Cases some basic requirements are
> also listed.
>
>     This document considers a number of uses of data compression in computer
> systems - and especially those uses that are networking related.  For a number
> of such uses, we note whether that usage would justify the addition of
> compression/decompression HW for Mellanox chips, as well as how this use case
> would affect the ODP API.  This document is organized as a taxonomy or flow
> chart of the compression/decompression categories.
>
>
> 1 Low Speed Uses:
> -----------------
>     Note that we ignore many uses of compression that are usually done
> offline (often in SW) OR that happen in real-time but at low speeds.  Here we
> define low speed as those use cases that can be handled reasonably well in SW
> and so don't justify special compression/decompression HW.  Today this usually
> means speeds of less than 1 Gbps.
>
>
> 2 High Speed Uses:
> ------------------
>
>
> 2.1 Lossy versus Lossless:
> --------------------------
>     First we discuss the two major categories: lossy compression use cases
> and lossless compression use cases.
>
>
> 2.1.1 Lossy Compression:
> ------------------------
>     Lossy compression/decompression is an important use of compression in
> the Internet.  In particular, lossy audio and video compression is widespread;
> however, the algorithms used there - like those described in the MPEG
> standards - are both very different from and far more complex than the
> algorithms used for lossless data compression.  Often the compression can be
> done off-line rather than in real time - though decompression generally needs
> to be done in real time.
>
>     Since the current ODP compression API is designed for lossless
> compression, the lossy use cases are ignored/irrelevant.
>
>
> 2.1.2 Lossless Compression:
> ---------------------------
>     Lossless data compression generally uses a small set of simpler
> algorithms - which are more amenable to HW implementation.  This is the only
> category considered for the current ODP HW.
>
>
> 2.1.2.1 Offline vs Real-time:
> -----------------------------
>     Lossless data compression/decompression can be done "offline" or in
> "real-time".  By real-time we mean uses where the data to be compressed or
> decompressed is arriving over a high speed network port and needs to be dealt
> with at network speeds and with low latency.
>    <TBD better definition of offline/real-time>
>
>
> 2.1.2.2 Offline:
> ----------------
>     An important offline use case could be whole file compression.  However,
> this is currently often done by the OS (somewhat transparently) and
> implemented in kernel SW.
>
>     Another example is compression/decompression of large files/archives
> for download (e.g. a compressed ISO file or a compressed tar file).
> Unfortunately, while gzip and zip are still used, it is more common today
> to instead use algorithms like bzip2 and xz.  But these newer algorithms are
> much more difficult to implement in HW because of the massive history buffers
> that are used.
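>
>     As a point of reference for how this offline case is handled in SW today,
> below is a minimal sketch of whole-buffer compression using zlib's one-shot
> API (zlib implements the "deflate" algorithm underlying gzip).  It is purely
> illustrative and not part of any proposed ODP API.
>
>     #include <stdio.h>
>     #include <stdlib.h>
>     #include <string.h>
>     #include <zlib.h>
>
>     /* Compress a whole in-memory "file" in one shot (link with -lz). */
>     int main(void)
>     {
>         const char *src = "example file contents ...";
>         uLong src_len = (uLong)strlen(src) + 1;
>
>         uLongf dst_len = compressBound(src_len);  /* worst-case output size */
>         Bytef *dst = malloc(dst_len);
>         if (dst == NULL)
>             return 1;
>
>         if (compress2(dst, &dst_len, (const Bytef *)src, src_len,
>                       Z_BEST_SPEED) != Z_OK) {
>             fprintf(stderr, "compression failed\n");
>             free(dst);
>             return 1;
>         }
>         printf("%lu bytes -> %lu bytes\n",
>                (unsigned long)src_len, (unsigned long)dst_len);
>         free(dst);
>         return 0;
>     }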
>
>     While these use cases cannot justify the addition of compression/
> decompression HW, if such HW exists for other uses, then it might be nice if
> the HW could handle this case as well.  However this should be
> considered a low priority.
>
>
> 2.1.2.3 Real-time:
> ------------------
>     Real-time use cases occur as a result of some (standardized?) network
> protocol.  These protocols can be divided into:
>     a) those that compress at the packet level - like IPComp and IPsec
>     b) those that compress at the disk sector/block level
>     c) those that compress at the file level
>
>
> 2.1.2.3.1 Packet Level Compression:
> -----------------------------------
>     Packet level compression/decompression - on high speed networks - is not
> that common, with perhaps the exception being the IPComp protocol (RFC 3173) -
> especially in conjunction with the IPsec protocols (several dozen RFC's),
> which is the only use case discussed.
>
>
> 2.1.2.3.1.1 IPComp/IPsec:
> -------------------------
>     While IPComp in theory could be used in a fairly general way, in practice
> today it is almost always used as the "compression" part of the IPsec
> protocol.  So we ignore all other uses of IPComp.
>
>     While low speed IPsec tunnels might use and benefit from IPComp
> compression/decompression, because the traffic is low speed this will be done
> in SW and has no relevance for an ODP HW (i.e. high speed)
> compression/decompression engine.
>
>     Conversely, compression on high speed IPsec tunnels is less common, BUT
> one could argue that this is primarily because of the lack of HW compression!
> So we consider HW compression to be an enabler for high speed IPsec
> compression.  Of course, since this use case isn't solving a current common
> usage, it may not by itself justify the addition of this HW, BUT if such HW
> already exists (i.e. is justified by some other use case), then this use case
> becomes fairly relevant.
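>
>     For reference, the IPComp header that precedes the compressed payload is
> small and fixed-size (RFC 3173).  A sketch of it as a C struct (illustrative
> only):
>
>     #include <stdint.h>
>
>     /* IPComp header, per RFC 3173: 4 bytes placed before the compressed
>      * payload. */
>     struct ipcomp_hdr {
>         uint8_t  next_header;  /* protocol of the original (uncompressed) data */
>         uint8_t  flags;        /* reserved, must be zero */
>         uint16_t cpi;          /* Compression Parameter Index (network order) */
>     };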
>
>
> 2.1.2.3.2 Disk Sector/Block Compression:
> ----------------------------------------
>     This use case is often not specified in the disk block access protocols
> themselves, but instead is done as a side effect of reading and writing
> disk blocks by protocols like iSCSI, NFS, NVMe-oF, etc.
>     <TBD I wasn't able to reach the Mellanox architect knowledgeable about
> this use case, so this section is still To Be Done>
>
>
> 2.1.2.3.3 File Level Compression:
> ---------------------------------
>     This use case is one of the main use cases that can justify the addition
> of a compression HW engine.
>
>
> 2.1.2.3.3.1 SSL/TLS:
> --------------------
>     While the SSL/TLS protocol includes the capability to do
> compression/decompression as part of its operation - assuming both parties
> agree that they want to do it - it turns out to be almost never used.  Since
> this feature is so rarely used, the upcoming new version of the TLS protocol
> spec (TLS 1.3) has actually removed this capability from the protocol.
>
>     Hence, while on paper, this looks like a promising use case, not only does
> it NOT justify adding compression HW, it is NOT worth considering this use
> case even if such HW is already there!  Hence we completely ignore this case.
>
>
> 2.1.2.3.3.2 HTTP:
> -----------------
>     This is the MOST important use case - both for justifying the addition of
> compression/decompression HW and for actually using it on high speed networks.
>
>     Note that the current IANA HTTP Transfer Coding Registry lists the
> following compression formats (ignoring the old deprecated aliases):
>     a) compress - UNIX "compress" data format
>     b) deflate - "deflate" compressed data inside of the "zlib" data format
>     c) gzip - GZIP file format
>
>     It is IMPORTANT to understand how the HTTP protocol works and how
> compression fits into this protocol.  Some of the details below have been
> simplified to make them easier to understand, hence the description below
> should not be considered fully accurate or complete, but the concepts and
> issues raised should be correct.
>
>     First of all HTTP is not a packet based protocol but a byte-stream based
> protocol running over TCP.  What this means is that HTTP request headers and
> response headers, along with data stream headers like chunked data headers,
> are not necessarily related to IP packet boundaries.  Instead HTTP sends a
> byte stream to TCP which is then free to carve up these bytes into TCP/IP
> packet payloads in any way it wishes.
>
>     Next, HTTP is not compressing/decompressing packet payloads, but instead
> is working on whole files being transferred (technically it operates on
> "resources" as indicated by a Uniform Resource Identifier (URI), but the
> distinction from a compress/decompress perspective is moot).  Although HTTP
> has the concept of many "methods" operating on these resources, in practice
> the vast majority of requests involve methods that simply get or put "files",
> and these files can be (and often are) compressed - at the HTTP level.  Here
> we ignore files that HTTP considers to be uncompressed but whose actual file
> format uses some compression algorithm internally.  Examples are JPEG files
> (typically using lossy compression) and sometimes PNG files (using lossless
> compression).  Since compressing an already compressed file isn't useful,
> one doesn't want to do HTTP compression on e.g. JPEG files.
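>
>     For illustration, a simplified (hypothetical) exchange in which the client
> advertises support for compression and the server returns a gzip-compressed,
> chunked response could look like this on the wire:
>
>     GET /index.html HTTP/1.1
>     Host: www.example.com
>     Accept-Encoding: gzip, deflate
>
>     HTTP/1.1 200 OK
>     Content-Type: text/html
>     Content-Encoding: gzip
>     Transfer-Encoding: chunked
>
>     <gzip-compressed body, carved into chunks as described below>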
>
>     The HTTP side receiving files (compressed or not) needs some way to know
> when that file transfer request is done.  With HTTP/1.1, Web Servers will
> usually want to transfer multiple files over a single TCP connection, so the
> "old way" of knowing that the transfer is done - by seeing the TCP connection
> terminated - isn't sufficient.  Instead HTTP either uses a Content-Length
> header field OR uses the chunked Transfer-Encoding method.
>
>     In particular, chunked encoding is widely used.  This method modifies the
> file body being transferred (compressed or not) by dividing it into a series
> of chunks, where each chunk has a newly inserted chunk header at the start.
> A chunk consists of a chunk-size (a variable number of hex digits) followed
> by an optional chunk-extension followed by a CRLF (CRLF means the two
> characters Carriage Return followed immediately by Line Feed).  Next come the
> actual file bytes - whose length is indicated by the chunk-size - followed by
> CRLF.  This pattern is then repeated until the entire file has been
> transferred, and a final zero-length chunk marks the end.
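>
>     As a concrete (hypothetical) example, a 42-byte body sent with chunked
> encoding could appear on the wire as follows, with each CRLF shown explicitly
> as \r\n (0x1a = 26 bytes, 0x10 = 16 bytes, and the final zero-length chunk
> terminates the transfer):
>
>         1a\r\n
>         abcdefghijklmnopqrstuvwxyz\r\n
>         10\r\n
>         1234567890abcdef\r\n
>         0\r\n
>         \r\n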
>
>     So if this HTTP payload is fed directly into a decompression engine, bad
> things will happen because of these inserted chunk headers.  The decompression
> API therefore needs a way to indicate where the file data starts in the packet
> and also where the chunk headers are so that they can be skipped.  Note that
> the file data will often not start immediately after a packet's TCP header,
> because every file transfer has a variable length HTTP request header or HTTP
> response header preceding its first byte.  Also the chunk headers could be
> split across TCP packets in an arbitrary way.
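>
>     One purely hypothetical way such per-packet metadata could be expressed
> (these are NOT existing ODP structures, just an illustration of the
> information the API would need to carry) is sketched below:
>
>     #include <stdint.h>
>
>     /* Hypothetical descriptor for one input packet of an HTTP decompress job.
>      * It tells the engine where the compressed file bytes live inside the
>      * packet and which byte ranges (chunk headers) must be skipped. */
>     #define MAX_SKIP_RANGES 8   /* more than this could be treated as an attack */
>
>     struct skip_range {
>         uint16_t offset;         /* bytes from the start of the TCP payload */
>         uint16_t length;         /* length of the chunk header to skip */
>     };
>
>     struct http_decomp_pkt_desc {
>         void     *pkt;               /* handle to the packet buffer */
>         uint16_t  data_offset;       /* first file byte within the TCP payload */
>         uint16_t  data_length;       /* number of file bytes in this packet */
>         uint8_t   num_skip_ranges;   /* chunk headers falling inside this packet */
>         struct skip_range skip[MAX_SKIP_RANGES];
>     };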
>
>     In addition, the result of the decompression of this file isn't logically
> a set of packets, but instead is a (potentially huge) memory buffer holding
> the decompressed/original file data.  By the way, this memory buffer often
> will need to actually be implemented as a linked list of contiguous memory
> regions/buffers.  Ideally these regions/buffers would have a size on the
> order of 16K to 64K.  So the preferred API would be one which calls an
> asynchronous decompress function with a batch of packets and a pointer to one
> byte past where the last decompressed byte for this file was written.  The
> batch of input packets also needs a way to specify where the file data starts
> and ends in each packet (via a byte offset from, say, the TCP header start
> plus a count of the file bytes in this packet) AND also the byte offsets and
> lengths of the chunk headers.  Note that, while not common, one could have
> multiple chunk headers in the same packet.  If one sees a large number of
> chunk headers in the same packet (say > 8 or 16), then it is probably best to
> detect this, abort the connection and call it an attack.
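>
>     A minimal sketch of what the output side and the asynchronous call could
> look like follows.  Again, these names and signatures are hypothetical,
> intended only to make the shape of such an API concrete:
>
>     #include <stdint.h>
>
>     /* Hypothetical chain of contiguous output buffers (e.g. 16K-64K each)
>      * that accumulates the decompressed file data. */
>     struct out_buf {
>         struct out_buf *next;
>         uint8_t        *base;      /* start of this contiguous region */
>         uint32_t        capacity;  /* size of the region, e.g. 16K to 64K */
>         uint32_t        used;      /* decompressed bytes written so far */
>     };
>
>     /* Hypothetical asynchronous call: submit a batch of packet descriptors
>      * (see the earlier sketch) plus the buffer chain and the current write
>      * position; completion would be reported later, e.g. via an event. */
>     struct http_decomp_pkt_desc;
>     int http_decompress_async(void *session,
>                               struct http_decomp_pkt_desc *pkts, int num_pkts,
>                               struct out_buf *out_chain, uint8_t *write_pos);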
>
>     The preferred compression API is less clear cut.  First consider the
> output part of the API.  One eventually needs to send the compressed file
> data in a series of TCP/HTTP packets.  But if one supplies the compression
> engine with a set of "empty" packets, then who adds the various protocol
> headers?  The L2 and L3 headers aren't hard, but adding a TCP header is
> a lot harder because of sequence numbers, flags, etc.  But probably harder
> still is adding the variable length HTTP request/response header before the
> compressed data AND the chunk headers in the middle of the compressed data!
> So it is probably better to have the compressed data be appended to a linked
> list of contiguous memory buffers (just like the output for the decompress
> case) and then let the ODP application add the various HTTP headers and pass
> the result to some ODP compatible TCP stack.
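>
>     Such a compress call could simply mirror the decompress sketch above
> (again with hypothetical names, not an existing ODP API):
>
>     /* Hypothetical mirror of the decompress call: the engine appends
>      * compressed bytes to the buffer chain; the application later inserts
>      * the HTTP response header and chunk headers and hands the result to
>      * an ODP compatible TCP stack. */
>     struct out_buf;
>     int http_compress_async(void *session,
>                             const uint8_t *file_data, uint32_t file_len,
>                             struct out_buf *out_chain, uint8_t *write_pos);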
>
> *******************************************************
>
> Thanx Barry.
