This is part of a convolution filter. The kernel weights sum to 16, so after 
dividing by 16 the result stays in the same range as the input (uint8):
    
    
    let value:int32  = r0[col0].int32     + r0[col1].int32 * 2 + r0[col2].int32     +
                       r1[col0].int32 * 2 + r1[col1].int32 * 4 + r1[col2].int32 * 2 +
                       r2[col0].int32     + r2[col1].int32 * 2 + r2[col2].int32
    w1[col1] = (value  / 16.0).uint8
    

I am looking for ways to improve its performance without getting into SIMD 
(which I have never used, by the way). I suspect that all those type 
conversions are killing the performance that I should be getting.
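
For illustration, here is a minimal sketch of one SIMD-free variant that could be measured against the code above. It only swaps the float division for an integer shift. It has not been benchmarked, so whether it helps at all is an open question. The proc name `blurPixel` and its signature are hypothetical, and the rows are assumed to be `uint8` sequences or arrays.

```nim
# Hypothetical helper: the name and parameters are not from the original code.
proc blurPixel(r0, r1, r2: openArray[uint8]; col0, col1, col2: int): uint8 =
  # Accumulate in a plain int. The weights sum to 16, so the total
  # is at most 16 * 255 = 4080.
  let value = r0[col0].int     + r0[col1].int * 2 + r0[col2].int     +
              r1[col0].int * 2 + r1[col1].int * 4 + r1[col2].int * 2 +
              r2[col0].int     + r2[col1].int * 2 + r2[col2].int
  # Integer shift instead of a float division: for non-negative values,
  # value shr 4 equals value div 16 (truncating, like the float version).
  result = uint8(value shr 4)

when isMainModule:
  let flat = [255'u8, 255, 255]
  doAssert blurPixel(flat, flat, flat, 0, 1, 2) == 255
```

The shift truncates the same way the conversion from float to `uint8` does, so the output should match the original for all `uint8` inputs.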
