[My apologies if anyone gets this twice...the original one looks to me 
to have fallen into the bit-bucket.]

Hi,

Taking off from something Marc-André mentioned recently, I'd like to
propose a new set of utilities that the whole freerdp codebase can use
to accelerate basic operations over arrays of data: copies, shifts,
adds, initialization, etc.

At this point I'm just thinking of defining the interface level and
implementing the basics in C. As time goes on, all of the operations
can be optimized, using such strategies as pulling in SSE- or
NEON-optimized code; threads; external optimized libraries such as Intel
Integrated Performance Primitives, liborc, the Parallel Patterns
Library, or OpenMAX DL; or even GPU optimizations via OpenCL. But those
details should be invisible to the general freerdp programmer who just
wants to say, "do this operation as fast as possible" and not worry
about how it gets done.

To keep the namespace simple, the calls would all take the form:

    dataop_<operation>_<dest_data_type>[_<op_data_type1>][_<op_data_type2>]...[_<details>]

where the data types are standard freerdp data types such as sint16 and 
uint32, puint8 for a pointer to an array of uint8 values, or something 
else relevant.

Thus, as examples...

  * dataop_copy_ptr_ptr(dst, src, count) does the same thing as memcpy.
  * dataop_set_puint8(dst, val, count) does the same thing as memset.
  * dataop_set_puint32(dst, val, count) sets a block of data to a uint32
    value.
  * dataop_blend_puint32_puint32_bgra(dst, src, count) does alpha
    blending of the source into the destination.
  * dataop_shift_psint16(dst, shifts, count) does a shift of the "count"
    signed 16-bit values in dst, left if positive, right if negative.
  * dataop_add_psint8_psint8(dst, src, count) adds one array of sint8
    values to another.
  * dataop_rectcopy_puint32_puint32(dst, src, dstrect, srcpt, dstwidth,
    srcwidth) copies a block of 32-bit pixels from one buffer to
    another, or from one spot in a single buffer to another.
  * etc.
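
To make the naming scheme concrete, here is a minimal sketch of what the
plain-C baseline for three of the calls above might look like. The
typedefs are stand-ins for the freerdp integer types so the sketch
compiles on its own. Two details are assumptions, since the proposal
does not pin them down: dataop_copy_ptr_ptr takes its count in bytes
(as memcpy does), and the sint8 add saturates rather than wraps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the freerdp integer types (assumed for this sketch). */
typedef uint8_t  uint8;
typedef int8_t   sint8;
typedef uint32_t uint32;

/* Assumed contract: same as memcpy, count in bytes, no overlap. */
void dataop_copy_ptr_ptr(void *dst, const void *src, size_t count)
{
    assert(count == 0 || (dst != NULL && src != NULL));
    if (count > 0)
        memcpy(dst, src, count);
}

/* Fill count 32-bit elements with val. */
void dataop_set_puint32(uint32 *dst, uint32 val, size_t count)
{
    size_t i;

    assert(count == 0 || dst != NULL);
    for (i = 0; i < count; i++)
        dst[i] = val;
}

/* dst[i] += src[i], clamped to the sint8 range (saturation is an
 * assumption; a wrapping add would be an equally valid reading). */
void dataop_add_psint8_psint8(sint8 *dst, const sint8 *src, size_t count)
{
    size_t i;

    assert(count == 0 || (dst != NULL && src != NULL));
    for (i = 0; i < count; i++) {
        int sum = dst[i] + src[i];

        if (sum > INT8_MAX)
            sum = INT8_MAX;
        if (sum < INT8_MIN)
            sum = INT8_MIN;
        dst[i] = (sint8)sum;
    }
}
```

Callers only ever see these signatures, so a later SSE or NEON version
can replace a loop body without touching any calling code.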

All of the functions should be written to be thread-safe.  I'm open to a 
different prefix than "dataop_".

I thought of passing a dataop_context parameter around that could 
(privately) contain state data about the operations, e.g. function 
pointers and such, but it's hard to get at a single data field from all 
parts of the code, so I think it's better just to have the dataop 
functions keep any state they need as a static internal variable 
available only to the dataop functions.

  * dataop_init() would be a new call, exercised early in freerdp
    startup, that would let the dataop code do whatever initialization
    it deemed necessary.  This might be nothing at first, but eventually
    it might do things like test levels of SSE support and dynamically
    pick which optimized routine to hook up to calls, test for GPU
    support, initialize external libraries, etc.  It might even
    dynamically benchmark a couple of operations to pick between methods.
  * dataop_cleanup() would let the dataop functions clean up any memory
    they had allocated or shut things down cleanly.
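
One way the static-state idea could work is a private function pointer
per operation that dataop_init() fills in. The sketch below is only an
illustration under stated assumptions: cpu_has_fast_path() is a
hypothetical placeholder for a real capability probe (CPUID or similar),
and the unrolled loop merely stands in for an SSE or NEON routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t uint32;   /* stand-in for the freerdp type */

typedef void (*set_puint32_fn)(uint32 *dst, uint32 val, size_t count);

/* Portable baseline. */
static void set_puint32_generic(uint32 *dst, uint32 val, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        dst[i] = val;
}

/* Stand-in for an optimized routine: four stores per iteration,
 * then a tail loop for the remaining 0..3 elements. */
static void set_puint32_unrolled(uint32 *dst, uint32 val, size_t count)
{
    size_t i = 0;

    for (; i + 4 <= count; i += 4) {
        dst[i] = val;
        dst[i + 1] = val;
        dst[i + 2] = val;
        dst[i + 3] = val;
    }
    for (; i < count; i++)
        dst[i] = val;
}

/* Hypothetical probe; a real one would test for SSE, NEON, etc. */
static int cpu_has_fast_path(void)
{
    return 1;
}

/* Private state, visible only inside the dataop code. It defaults to
 * the generic routine so calls made before dataop_init() still work. */
static set_puint32_fn g_set_puint32 = set_puint32_generic;

void dataop_init(void)
{
    g_set_puint32 = cpu_has_fast_path() ? set_puint32_unrolled
                                        : set_puint32_generic;
}

void dataop_cleanup(void)
{
    g_set_puint32 = set_puint32_generic;
}

void dataop_set_puint32(uint32 *dst, uint32 val, size_t count)
{
    assert(g_set_puint32 != NULL);
    g_set_puint32(dst, val, count);
}
```

If dataop_init() runs once before any worker threads start, the pointer
is read-only afterwards, which keeps the calls themselves thread-safe
without locking.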

At this point, the goal isn't to come up with the fastest possible 
implementation of the functions, but rather to make them available so 
they can start to be used in new and rewritten code.  The actual 
optimization of the functions can happen in parallel and independently.

I'd also like to add a unit test function that you could run on any 
system and get speed measurements for all of the functions.
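
A rough sketch of what one such measurement could look like, using only
the standard C clock() call. The name dataop_bench_set_puint32 and its
return conventions are hypothetical, and the local dataop_set_puint32
is just a portable stand-in for whichever routine is being timed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

typedef uint32_t uint32;   /* stand-in for the freerdp type */

/* Portable stand-in for the routine under test. */
static void dataop_set_puint32(uint32 *dst, uint32 val, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        dst[i] = val;
}

/* Runs the operation reps times over count elements.
 * Returns the average CPU seconds per call, -1.0 if the buffer could
 * not be allocated, or -2.0 if the final buffer contents are wrong. */
double dataop_bench_set_puint32(size_t count, int reps)
{
    uint32 *buf;
    clock_t start, end;
    double result;
    int r;

    assert(count > 0 && reps > 0);

    buf = malloc(count * sizeof *buf);
    if (buf == NULL)
        return -1.0;

    start = clock();
    for (r = 0; r < reps; r++)
        dataop_set_puint32(buf, (uint32)r, count);
    end = clock();

    /* Check the output too, so a fast but broken routine is caught. */
    if (buf[0] != (uint32)(reps - 1) || buf[count - 1] != (uint32)(reps - 1))
        result = -2.0;
    else
        result = ((double)(end - start) / CLOCKS_PER_SEC) / reps;

    free(buf);
    return result;
}
```

clock() has coarse resolution on many systems, so small buffers need a
large reps value before the numbers mean much.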

Comments?

Daryl

