Re: Piercing the veil

James Peach Sat, 26 Jan 2013 10:44:20 -0800

On 26/01/2013, at 8:55 AM, "Owens, Steve" <[email protected]> wrote:


> James,
> 
> Thank you for clearing that up.  It might be helpful to update the example
> plugins as well as the man pages because for example in the null transform
> plugin we have
> 
> if (towrite > 0) {
>    /* The amount of data left to read needs to be truncated by
>     * the amount of data actually in the read buffer.
>     */
>    avail = TSIOBufferReaderAvail(TSVIOReaderGet(input_vio));
>    TSDebug("null-transform", "\tavail is %" PRId64 "", avail);
>    if (towrite > avail) {
>      towrite = avail;
>    }
> 
>    if (towrite > 0) {
>      /* Copy the data from the read buffer to the output buffer. */
>      TSIOBufferCopy(TSVIOBufferGet(data->output_vio),
> TSVIOReaderGet(input_vio), towrite, 0);
> 
>      /* Tell the read buffer that we have read the data and are no
>       * longer interested in it.
>       */
>      TSIOBufferReaderConsume(TSVIOReaderGet(input_vio), towrite);
> 
>      /* Modify the input VIO to reflect how much data we've
>       * completed.
>       */
>      TSVIONDoneSet(input_vio, TSVIONDoneGet(input_vio) + towrite);
>    }
>  }
> 
> Which does not handle the case where the downstream buffer may be full.

Note that I'm not claiming that this can happen; just that the API contract 
allows it and that it's best to stick to the letter of the law :)

>  I am glad I asked.
> 
> Cheers,
> 
> Steve Owens
> 
> 
> 
> On 1/25/13 9:29 PM, "James Peach" <[email protected]> wrote:
> 
>> On 25/01/2013, at 2:14 PM, "Owens, Steve" <[email protected]> wrote:
>> 
>>> In my attempts to understand the mechanics of TSIOBufferCopy I found
>>> the following function definition in InkIOCoreAPI.cc
>> 
>> TSIOBufferCopy copies data from a TSIOBufferReader into a TSIOBuffer.
>> IMHO it doesn't really matter whether it copies bytes or whether it
>> shares references to internal buffers. It's just efficiently sucking data
>> out of the reader.
>> 
>>> 
>>> int64_t
>>> TSIOBufferCopy(TSIOBuffer bufp, TSIOBufferReader readerp, int64_t
>>> length, int64_t offset)
>>> {
>>>  sdk_assert(sdk_sanity_check_iocore_structure(bufp) == TS_SUCCESS);
>>>  sdk_assert(sdk_sanity_check_iocore_structure(readerp) == TS_SUCCESS);
>>>  sdk_assert((length >= 0) && (offset >= 0));
>>> 
>>>  MIOBuffer *b = (MIOBuffer *)bufp;
>>>  IOBufferReader *r = (IOBufferReader *)readerp;
>>> 
>>>  return b->write(r, length, offset);
>>> }
>>> 
>>> Note that it is undocumented but apparently the b->write(r, length,
>>> offset) does the work.
>>> 
>>> I believe I found the declaration of that function in I_IOBuffer.h  One
>>> thing that sort of stuck out to me in the set of code comments below,
>>> was the fact that it says
>>> 
>>> "Should it be necessary to make a large number of small transfers, it's
>>> preferable to use a interface that copies the data rather than sharing
>>> blocks to prevent
>>> a build of blocks on the buffer"
>>> 
>>> What confuses me is isn't TSIOBufferCopy supposed to be an interface
>>> that copies data rather than sharing blocks?
>> 
>> It's documented (in the v2 docs) as as sharing data blocks, but IMHO that
>> is an implementation detail that doesn't really matter for a plugin. The
>> plugin only cares whether the data is available in the right places.
>> TSIOBufferWrite() would always copy the data from an arbitrary buffer
>> (ie. no sharing).
>> 
>>> 
>>> 
>>> 
>>>  /**
>>>    Add by data from IOBufferReader r to the this buffer by reference.
>>> If
>>>    len is INT64_MAX, all available data on the reader is added. If len
>>> is
>>>    less than INT64_MAX, the smaller of len or the amount of data on the
>>>    buffer is added. If offset is greater than zero, than the offset
>>>    bytes of data at the front of the reader are skipped. Bytes skipped
>>>    by offset reduce the number of bytes available on the reader used
>>>    in the amount of data to add computation. write() does not respect
>>>    watermarks or buffer size limits. Users of write must implement
>>>    their own flow control. Returns the number of bytes added. Each
>>>    write() call creates a new IOBufferBlock, even if it is for one
>>>    byte. As such, it's necessary to exercise caution in any code that
>>>    repeatedly transfers data from one buffer to another, especially if
>>>    the data is being read over the network as it may be coming in very
>>>    small chunks. Because deallocation of outstanding buffer blocks is
>>>    recursive, it's possible to overrun the stack if too many blocks
>>>    have been added to the buffer chain. It's imperative that users
>>>    both implement their own flow control to prevent too many bytes
>>>    from becoming outstanding on a buffer that the write() call is
>>>    being used and that care be taken to ensure the transfers are of a
>>>    minimum size. Should it be necessary to make a large number of small
>>>    transfers, it's preferable to use a interface that copies the data
>>>    rather than sharing blocks to prevent a build of blocks on the
>>> buffer.
>>> 
>>>  */
>>>  inkcoreapi int64_t write(IOBufferReader * r, int64_t len = INT64_MAX,
>>> int64_t offset = 0);
>>> 
>>> 
>>> 
>>> Also, the comments say that the return value of write is the actual
>>> number of bytes written.  Is it safe to assume that the return value
>>> will equal len should offset be 0?
>> 
>> I would not make that assumption. It's probably true, but to be robust
>> you should handle the case where the receiving buffer is full and will
>> only accept a partial write.
>> 
>>> If so, when you are simply trying to copy data from upstream to
>>> downstream and you subsequently make a calls such as:
>>> 
>>> /* Copy the data from the read buffer to the output buffer. */
>>> bytesWritten = TSIOBufferCopy(TSVIOBufferGet(data->output_vio),
>>>                                     TSVIOReaderGet(input_vio), towrite, 0);
>>> 
>>> /* Tell the read buffer that we have read 'towrite' bytes of data
>>> * and are no longer interested in it.
>>> */
>>> TSIOBufferReaderConsume(TSVIOReaderGet(input_vio), bytesWritten);  //
>>> Should second param be bytesWritten or should it be towrite?
>> 
>> Definitely bytesWritten. It's more correct and no harder to implement :)
>> 
>>> /* Increment the input VIO nDone value to reflect how much
>>> * data we've copied from the upstream and written to the
>>> * downstream.
>>> */
>>> TSVIONDoneSet(input_vio, TSVIONDoneGet(input_vio) + bytesWritten); //
>>> Should Second param be TSVIONDoneGet(input_vio) + bytesWritten or
>>> TSVIONDoneGet(input_vio) + towrite?
>> 
>> Again, I'd recommend using bytesWritten.
>> 
>> We have been slowly adding man pages in the docs/sdk directory in the
>> source tree. I will add something for the IOBuffer API. I have found the
>> v2 documentation <http://people.apache.org/~zwoop/ats/15_sdk_pg.pdf> very
>> helpful. It's still largely accurate.
>> 
>> J
>

Re: Piercing the veil

Reply via email to