On 30 July 2017 at 02:30, James Almer wrote:
>
>
> Maybe poke Hendrik for his opinion, but it seems to work, so LGTM.
>
>
Managed to simplify the code and the crazy alignment requirements a lot by
just iterating over the buffer in reverse. There the overlapping in the
middle solved itself by writi
On 7/29/2017 9:48 PM, Rostislav Pehlivanov wrote:
> Speeds up decoding by 8% in total in the avx2 case.
>
> 20ms frames:
> Before (c): 17774 decicycles in postrotate, 262065 runs, 79 skips
> After (sse3): 9624 decicycles in postrotate, 262113 runs, 31 skips
> After (avx2): 7169 de
Speeds up decoding by 8% in total in the avx2 case.
20ms frames:
Before (c): 17774 decicycles in postrotate, 262065 runs, 79 skips
After (sse3): 9624 decicycles in postrotate, 262113 runs, 31 skips
After (avx2): 7169 decicycles in postrotate, 262104 runs, 40 skips
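
For reference, the 20ms figures above can be turned into per-function speedup ratios. This is only a sketch of the arithmetic: the decicycle counts are copied from the benchmark lines, while the `timings` dict and the `speedup` helper are names made up for this illustration. The ratios cover postrotate alone and are not the same thing as the 8% total decoding speedup.

```python
# Decicycle counts copied from the 20ms-frame benchmark lines above.
# The dict and helper names are hypothetical, used only for this sketch.
timings = {"c": 17774, "sse3": 9624, "avx2": 7169}


def speedup(baseline, optimized):
    """Return how many times faster `optimized` is than `baseline`."""
    return baseline / optimized


for name in ("sse3", "avx2"):
    ratio = speedup(timings["c"], timings[name])
    print(f"{name}: {ratio:.2f}x faster than c in postrotate")
```

That works out to roughly 1.85x for sse3 and 2.48x for avx2 in postrotate.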
10ms frame