Hi Fiona

This is an RFC document to outline our understanding of, and requirements for, the
compression API proposal in DPDK. It is based on "[RFC] Compression API in DPDK"
(http://dpdk.org/ml/archives/dev/2017-October/079377.html).
The intention of this document is to align on the concepts built into the compression
API and its usage, and to identify further requirements.

Going further, it could serve as a base for a Compression Module Programmer's Guide.

The current scope is limited to:
- definition of the terminology which forms the foundation of the compression API
- the typical API flow expected to be used by applications
 
Overview
~~~~~~~~
A. Notion of a session in the compression API
=============================================
A session is a per-device logical entity which is set up with chained xforms to be
performed on burst operations, where each individual entry carries an operation type
(decompress/compress) and related parameters.
Typical session parameters include:
- direction: compress / decompress
- dev_id
- compression algorithm and other related parameters
- a mempool, for use by the session for runtime requirements
- any other associated private data maintained by the session
 
An application can set up multiple sessions on a device, as dictated by
dev_info.nb_sessions or nb_session_per_qp.
 
B. Notion of burst operations in the compression API
====================================================
struct rte_comp_op defines the compression/decompression operational parameters and
makes up a single element of a burst. It is both an input and an output parameter:
the PMD reads source, destination and checksum information on input, and updates the
op with bytes consumed and produced on output.
Once enqueued for processing, an rte_comp_op *cannot be reused* until its status
is set to RTE_COMP_OP_STATUS_FAILURE or RTE_COMP_OP_STATUS_SUCCESS.
 
C. Session and rte_comp_op
==========================
Every operation in a burst is tied to a session. There is more to cover on this under
the Stateless vs Stateful section.
 
D. Stateless vs Stateful
========================
The compression API provides the RTE_COMP_FF_STATEFUL feature flag for a PMD to
reflect its support for stateful operation.
 
D.1 Compression API Stateless operation
---------------------------------------
A stateless operation means all enqueued packets are independent of each other, i.e.
each packet has:
- its flush value set to RTE_FLUSH_FULL or RTE_FLUSH_FINAL (required only on the
compression side),
- all of the required input and a sufficiently large buffer to store the output, i.e.
OUT_OF_SPACE can never occur (required during both compression and decompression)
 
In such a case, the PMD initiates stateless processing and releases the acquired
resources after processing of the current operation is complete, i.e. full input
consumed and full output written.
An application can attach the same or a different session to each packet and can make
consecutive enque_burst() calls, i.e. the following is valid usage:
 
enqueued = rte_comp_enque_burst(dev_id, qp_id, ops1, nb_ops);
enqueued = rte_comp_enque_burst(dev_id, qp_id, ops2, nb_ops);
enqueued = rte_comp_enque_burst(dev_id, qp_id, ops3, nb_ops);
 
*Note* – every call has a different ops array, i.e. the same rte_comp_op array
*cannot be reused* to queue the next batch of data until the previous ones are
completely processed.

Also, if multiple threads call enqueue_burst() on the same queue pair, then it is the
application's responsibility to use a proper locking mechanism to ensure serialized
enqueuing of operations.
 
Please note that any time the output buffer runs out of space during a write, the
operation turns "stateful". See more on stateful operation under the respective section.

Typical API flow to set up for stateless operation:
1. rte_comp_session *sess = rte_comp_session_create(rte_mempool *pool);
2. rte_comp_session_init(int dev_id, rte_comp_session *sess, rte_comp_xform
*xform, rte_mempool *sess_pool);
3. rte_comp_op_pool_create(rte_mempool ..)
4. rte_comp_op_bulk_alloc(struct rte_mempool *mempool, struct rte_comp_op
**ops, uint16_t nb_ops);
5. for every rte_comp_op in ops[],
    5.1 rte_comp_op_attach_session(rte_comp_op *op, rte_comp_session *sess);
    5.2 set up the src/dst buffers
6. enq = rte_compdev_enqueue_burst(uint8_t dev_id, uint16_t qp_id, struct
rte_comp_op **ops, uint16_t nb_ops);
7. dqu = rte_compdev_dequeue_burst(dev_id, qp_id, ops, enq);
8. repeat 7 while (dqu < enq) // wait till all enqueued ops are dequeued
9. repeat 5.2 for the next batch of data
10. rte_comp_session_clear() // resets only the private data memory area and *not*
the xform and dev_id information, in case you want to reuse the session
11. rte_comp_session_free(rte_comp_session *session)
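Stitched together, the steps above could look roughly like the sketch below. This is
illustrative pseudocode only: it uses the API names proposed in this RFC (none of
which exist yet in DPDK), and the xform field names, error handling and buffer setup
are assumptions.

```
/* Illustrative sketch of the stateless flow, using the *proposed* API
 * names from this RFC; not compilable against any released DPDK. */
struct rte_comp_xform xform = {
    .type = RTE_COMP_COMPRESS,   /* assumed field name */
    /* algorithm and related parameters filled in here */
};

rte_comp_session *sess = rte_comp_session_create(sess_pool);        /* step 1 */
rte_comp_session_init(dev_id, sess, &xform, sess_priv_pool);        /* step 2 */

struct rte_mempool *op_pool = rte_comp_op_pool_create(/* ... */);   /* step 3 */
struct rte_comp_op *ops[BURST_SIZE];
rte_comp_op_bulk_alloc(op_pool, ops, BURST_SIZE);                   /* step 4 */

for (i = 0; i < BURST_SIZE; i++) {                                  /* step 5 */
    rte_comp_op_attach_session(ops[i], sess);                       /* 5.1 */
    /* 5.2: point ops[i] at its src buffer and a dst buffer large
     * enough that OUT_OF_SPACE cannot occur */
}

uint16_t enq = rte_compdev_enqueue_burst(dev_id, qp_id, ops, BURST_SIZE); /* 6 */
uint16_t dqu = 0;
while (dqu < enq)                                                   /* 7, 8 */
    dqu += rte_compdev_dequeue_burst(dev_id, qp_id, &ops[dqu], enq - dqu);

/* step 9: refill src/dst (5.2) and enqueue the next batch ... */
rte_comp_session_clear(sess);                                       /* step 10 */
rte_comp_session_free(sess);                                        /* step 11 */
```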

D.1.2 Requirement for Stateless
-------------------------------
Since operations can complete out of order, there should be one (void *user) pointer
per rte_comp_op to enable the application to map a dequeued op back to the
corresponding enqueued op.
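A minimal, self-contained illustration of why this mapping works regardless of
completion order. Here struct comp_op_stub and struct app_ctx are hypothetical
stand-ins, not part of the proposed API; only the per-op (void *user) cookie is the
point being illustrated.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_comp_op: only the proposed
 * per-op (void *user) field matters for this illustration. */
struct comp_op_stub {
    void *user;            /* application cookie, opaque to the PMD */
};

/* Hypothetical application-side context tracked per enqueued op. */
struct app_ctx {
    uint32_t pkt_id;       /* which packet this op belongs to */
};

/* Recover, from a dequeued op, the context attached at enqueue time;
 * returns UINT32_MAX if no cookie was attached. */
static uint32_t
app_map_dequeued_op(const struct comp_op_stub *op)
{
    const struct app_ctx *ctx = op->user;
    return ctx != NULL ? ctx->pkt_id : UINT32_MAX;
}
```

In practice the application would set the user pointer before enqueue and read it
back after dequeue; the exact field name in the final API is not fixed by this RFC.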

D.2 Compression API Stateful operation
--------------------------------------
A stateful operation means one of the following conditions holds:
- the API ran into an OUT_OF_SPACE situation during processing of the input, e.g. a
stateless compressed stream fed fully to the decompressor, but the output buffer is
not large enough to hold the output.
- the API is waiting for more input to produce output, e.g. a stateless compressed
stream fed partially to the decompressor.
- the API is dependent on a previous operation for further compression/decompression.
- API is dependent on previous operation for further compression/decompression

If any one, or all, of the above conditions apply, the PMD is required to maintain
the context of operations across enque_burst() calls, until a packet with
RTE_FLUSH_FULL/FINAL and sufficient input/output buffers is received and processed.
 
D.2.1 Compression API requirement for Stateful
---------------------------------------------------------------

D.2.1.1 Sliding Window Size
------------------------------------
Maximum length of the sliding window in bytes; previous-data lookup will be
performed up to this length. To be added as an algorithm capability parameter and
set by the PMD.
 
D.2.1.2 Stateful operation state maintenance
--------------------------------------------
This section starts with a description of our understanding of compression API
support for stateful operation. Building upon these concepts, we then identify the
data structures/parameters required for a PMD to maintain in-progress operation
context.
 
For stateful compression, a batch of dependent packets starts at a packet having the
RTE_NO_FLUSH/RTE_SYNC_FLUSH flush value and ends at a packet having
RTE_FULL_FLUSH/FINAL_FLUSH, i.e. an array of operations will carry a structure like
this:

-----------------------------------------------------------------
| op1.no_flush | op2.no_flush | op3.no_flush | op4.full_flush |
-----------------------------------------------------------------
 
For the sake of simplicity, we will use the term "stream" to identify such a related
set of operations in the following description.
 
Stream processing imposes the following limitations on usage of the enque_burst() API:
- all dependent packets in a stream should carry the same session
- if a stream is broken into multiple enqueue_burst() calls, then the next
enqueue_burst() cannot be called until the previous one has been fully processed, i.e.:

Consider, for example, a stream with ops1..ops7. This is *not* allowed:

    ----------------------------------------------------------------------------
    enque_burst(| op1.no_flush | op2.no_flush | op3.no_flush | op4.no_flush |)
    ----------------------------------------------------------------------------

    ----------------------------------------------------------------
    enque_burst(| op5.no_flush | op6.no_flush | op7.flush_final |)
    ----------------------------------------------------------------

This *is* allowed:

    ----------------------------------------------------------------------------
    enque_burst(| op1.no_flush | op2.no_flush | op3.no_flush | op4.no_flush |)
    ----------------------------------------------------------------------------

    deque_burst(ops1..ops4)

    ----------------------------------------------------------------
    enque_burst(| op5.no_flush | op6.no_flush | op7.flush_final |)
    ----------------------------------------------------------------

- a single enque_burst() can carry only one stream, i.e. this is *not* allowed:

    ---------------------------------------------------------------------------------------------
    enque_burst(| op1.no_flush | op2.no_flush | op3.flush_final | op4.no_flush | op5.no_flush |)
    ---------------------------------------------------------------------------------------------

If a stream is broken into several enqueue_burst() calls, then the compression API
needs to maintain operational state between calls. For this, the concept of
rte_comp_stream is introduced into the compression API.
Here are the proposed changes to the existing design:

1. Add rte_comp_op_type 
........................................
enum rte_comp_op_type {
    RTE_COMP_OP_STATELESS,
    RTE_COMP_OP_STATEFUL
};

2. Add new data type rte_comp_stream to maintain stream state
........................................................................................................
rte_comp_stream is a data structure opaque to the application, exchanged back and
forth between the application and the PMD during stateful compression/decompression.
It should be allocated per stream AND before the beginning of the stateful operation.
If a stream is broken into multiple enqueue_burst() calls, then each respective
enqueue_burst() must carry the same rte_comp_stream pointer. It is a mandatory input
for stateful operations.
An rte_comp_stream can be cleared and reused via the compression API
rte_comp_stream_clear() and freed via rte_comp_stream_free(). Clear/free should not
be called while the stream is in use.

This enables sharing of a session by multiple threads handling different streams, as
each bulk of ops carries its own context. It can also be used by the PMD to handle an
OUT_OF_SPACE situation.

3. Add stream allocate, clear and free API 
...................................................................
3.1 rte_comp_op_stream_alloc(rte_mempool *pool, rte_comp_op_type type,
rte_comp_stream **stream);
3.2 rte_comp_op_stream_clear(rte_comp_stream *stream); // after this, the stream is
usable for a new stateful batch
3.3 rte_comp_op_stream_free(rte_comp_stream *stream); // to free the context

4. Add new API rte_compdev_enqueue_stream()
...............................................................................
static inline uint16_t rte_compdev_enqueue_stream(uint8_t dev_id,
                                                  uint16_t qp_id,
                                                  struct rte_comp_op **ops,
                                                  uint16_t nb_ops,
                                                  rte_comp_stream *stream); // to be passed with each call

An application should call this API to process a dependent set of data, or when the
output buffer size is unknown.

rte_comp_op_pool_create() should create a mempool large enough to accommodate the
operational state (maintained by rte_comp_stream) based on rte_comp_op_type. Since
rte_comp_stream is maintained by the PMD, allocating it from a PMD-managed pool
offers performance gains.

API flow: rte_comp_op_pool_create() --> rte_comp_op_bulk_alloc() -->
rte_comp_op_stream_alloc() --> enque_stream(..ops, .., stream)
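Putting the pieces together, a stateful stream broken across two enqueue calls could
look roughly like the sketch below. Again, this is illustrative pseudocode using only
the API names proposed in this RFC; signatures, the dequeue loop and the flush-value
spellings are assumptions.

```
/* Illustrative sketch of the proposed stateful flow; not compilable
 * against any released DPDK. */
rte_comp_stream *stream;
rte_comp_op_stream_alloc(op_pool, RTE_COMP_OP_STATEFUL, &stream);

/* First part of the stream: ops1..ops4, all with a no-flush value,
 * all attached to the same session. */
enq = rte_compdev_enqueue_stream(dev_id, qp_id, ops_1_4, 4, stream);

/* The next enqueue for this stream must wait until the previous one
 * is fully processed. */
while (dqu < enq)
    dqu += rte_compdev_dequeue_burst(dev_id, qp_id, &done[dqu], enq - dqu);

/* Remainder of the stream: ops5..ops7, with op7 carrying flush_final. */
enq = rte_compdev_enqueue_stream(dev_id, qp_id, ops_5_7, 3, stream);
/* ... dequeue as above ... */

rte_comp_stream_clear(stream);  /* stream now reusable for a new batch */
rte_comp_stream_free(stream);   /* or free it once no longer needed */
```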

D.2.1.3 History buffer
-----------------------------
This will be maintained by the PMD within the rte_comp_stream.

Thanks
Shally
