On Sat, Apr 25, 2020 at 7:24 AM Kevin Kofler <kevin.kof...@chello.at> wrote:

> Richard Shaw wrote:
> > As far as LPCNet itself, I've communicated with the primary developer quite
> > a bit over the last week. LPCNet *will not work* without optimizations (at
> > least not in real time, which is the point).
>
> Has anyone (upstream or elsewhere) ever looked into doing an SSE2 version of
> the vector code? It should be faster than scalar (especially considering
> that the "scalar" floating-point code (under the default -mfpmath=sse)
> actually loads everything into SSE2 registers as well, but does not actually
> make use of the vectorization) and it would match the baseline of many
> distributions and upstreams out there.
>

It's funny, we just had this conversation yesterday, and I woke up to a pull
request to add SSE support.

https://github.com/drowe67/LPCNet/pull/25

TL;DR version: on my Ryzen 5 2600, SSE4.1 barely improved performance with
the current LPCNet code. The good news is that a beefy processor can perform
better than real time without optimizations, but that can't be assumed for
everyone. There will be people wanting to run this software on lower-end
laptops which can't keep up in real time.

Below is a quick table from the PR showing relative decode performance per
SIMD pathway:


   - Fedora 31
   - gcc 9.3.1
   - Ryzen 5 2600

SIMD      Time (s)   % real time
None      19.796     39.8%
SSE 4.1   17.971     36.1%
AVX       10.185     20.5%
AVX2       9.459     19.0%
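
To make the relative gains easier to read, here is a minimal Python sketch that turns the timings above into speedup factors against the unoptimized build. The helper names are made up for illustration. The second helper assumes that "% real time" means decode time divided by audio duration; the PR table does not spell that out, so treat the implied audio length as an inference rather than a reported figure.

```python
# Timings copied from the table above (Ryzen 5 2600, gcc 9.3.1, Fedora 31).
# Each entry: SIMD path -> (decode time in seconds, reported % of real time).
timings = {
    "None": (19.796, 39.8),
    "SSE 4.1": (17.971, 36.1),
    "AVX": (10.185, 20.5),
    "AVX2": (9.459, 19.0),
}


def speedup(baseline_s, optimized_s):
    """Ratio of baseline time to optimized time (>1 means faster)."""
    return baseline_s / optimized_s


def implied_audio_seconds(decode_s, percent_real_time):
    """Assumption: % real time = decode time / audio duration * 100."""
    return decode_s / (percent_real_time / 100.0)


baseline_s = timings["None"][0]
for path, (decode_s, pct) in timings.items():
    print(f"{path:8s} speedup {speedup(baseline_s, decode_s):.2f}x  "
          f"implied audio length {implied_audio_seconds(decode_s, pct):.1f} s")
```

Run against these numbers, SSE 4.1 comes out at roughly 1.10x over no SIMD, while AVX and AVX2 land at roughly 1.94x and 2.09x. Under the stated assumption, all four rows imply the same test sample of a little under 50 seconds.
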
Thanks,
Richard
