Hi Sebastian,
Thank you for your contribution; the code looks good.

Just a small comment for future performance improvement: instructions such as "fmov w12, s2" are expensive because they move data between the Neon and integer register files, especially when they sit inside a loop.

There are also some deep-seated data organization and algorithm problems. For example, we spend many instructions on absCoeff[numNonZero]; if we allowed spare zeros inside the array, we could eliminate many of those instructions.

Regards,
Min Chen

At 2022-03-02 07:28:15, "Pop, Sebastian" <s...@amazon.com> wrote:

> Hi,
>
> the attached patch fixes the registration of the costCoeffNxN function hook and removes the early return that I used for testing.
>
> Sebastian