Re: [eigen] Vectorization for general use

Rasmus Munk Larsen Sun, 20 Sep 2020 16:12:58 -0700

+1 I agree that AVX is probably a good compromise. (I'm running an old Ivy
Bridge i5 at home.)


On Sun, Sep 20, 2020 at 3:34 PM Patrik Huber <[email protected]> wrote:

> It might be worth adding that AVX has been around for a long time - it has
> been supported by Intel and AMD CPUs since 2011 (see
> https://en.wikipedia.org/wiki/Advanced_Vector_Extensions). AVX2 came to
> CPU generations in 2013. You might want to check or guess how many people
> really run an older CPU. If it is a fairly compute-heavy application,
> chances are that users won't have much fun with it anyway on older CPUs.
>
> That being said, in particular the Sandy Bridge (e.g. i5-25X0K, i7-2700K)
> and Ivy Bridge (e.g. i5-3550) were extremely popular CPUs and are probably
> still widely used. I myself have an i5-3550 and it runs everything
> perfectly, so I don't have a real reason to upgrade even that 7-year-old
> CPU. So I would not go as far as assuming that the majority of your users'
> CPUs would support AVX2 - but it might be true for AVX.
>
> One useful data point: Check the Steam Hardware Survey
> https://store.steampowered.com/hwsurvey/, scroll down to "Other
> Settings". According to that, as of Aug 2020, 93% of Steam users have CPUs
> supporting AVX, and 77% AVX2. This is likely biased towards gaming
> computers out there, but should be fairly representative still and I doubt
> you'll find better data.
>
> Best wishes,
> Patrik
>
>
> On Sat, 19 Sep 2020 at 22:30, Sripathi, Vamsi <[email protected]>
> wrote:
>
>> Another option to consider is to use a compiler that supports function
>> multi-versioning based on ISA. For example, the Intel Compiler has the -ax flag:
>> https://software.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/compiler-reference/compiler-options/compiler-option-details/code-generation-options/ax-qax.html#ax-qax
>>
>>
>>
>> GCC seems to support this through attribute directives -
>> https://lwn.net/Articles/691932/
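
As a rough illustration of the GCC attribute approach mentioned above, here is a minimal sketch using `target_clones`. The function `dot` and the macro `MULTIVERSION` are hypothetical names chosen for this example, and the guard assumes the attribute is only usable with GCC on x86-64 Linux with glibc (it relies on ifunc support); elsewhere the macro expands to nothing and a single default version is built.

```cpp
#include <cstddef>

// Assumption: target_clones is only enabled for GCC on x86-64 Linux/glibc.
// On other toolchains the macro is empty and the function is compiled once.
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__) && \
    defined(__linux__) && defined(__GLIBC__)
#define MULTIVERSION __attribute__((target_clones("avx2", "avx", "default")))
#else
#define MULTIVERSION
#endif

// Hypothetical hot loop. With the attribute active, the compiler emits one
// clone per listed target plus a resolver that picks a clone at load time.
MULTIVERSION
double dot(const double* a, const double* b, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Callers simply call `dot(...)`; no explicit CPU check is needed at the call site. Whether this plays well with heavily templated header-only code such as Eigen expressions is not something this sketch demonstrates.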
>>
>>
>>
>> -Vamsi
>>
>>
>>
>> *From:* Rob McDonald <[email protected]>
>> *Sent:* Friday, September 18, 2020 10:19 AM
>> *To:* [email protected]
>> *Subject:* Re: [eigen] Vectorization for general use
>>
>>
>>
>> Thanks for everyone's responses and links.  Very helpful.
>>
>>
>>
>> This seems like it is quite a thorny issue...  It really makes using
>> these advanced features fairly challenging.
>>
>>
>>
>> I'm not sure that it is practical for me to separate out a shared library
>> to be selectively loaded (vs. just separate executables).  Although the
>> algorithms may be somewhat contained, the data structures can have quite
>> wide reach.  It isn't obvious how to separate what needs to be compiled
>> with these flags and what does not (particularly since we didn't design for
>> this from the start).  This is also a case where Eigen being a header-only
>> library is a bit of a drawback.  If it was a traditional compiled library,
>> it would likely be easier to draw the line at eigen_sse.so, eigen_avx1.so
>> or whatever.
>>
>>
>>
>> My project builds with CMake, which isn't very friendly at using
>> different toolchains for different parts of the project -- or compiling the
>> same part multiple times.  It is possible, but not particularly pretty.
>>
>>
>>
>> Rob
>>
>>
>>
>>
>>
>> On Thu, Sep 17, 2020 at 11:09 PM William Tambellini <[email protected]>
>> wrote:
>>
>> A solution:
>>
>>    - do all the math/algos outside the main program, in dynamic libs (.so,
>>    .dll, ...)
>>    - build one dynamic lib per ISA you care about (sse.so,
>>    avx1.so, avx2.so, avx512.so, ...)
>>    - dynamically load the right lib from the main program according to the
>>    features of the CPU it is running on (
>>    https://github.com/google/cpu_features)
>>    - call your API in the lib from the main program so that the backend runs
>>    the algo with the best optimization
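
The selection step in the list above could be sketched roughly as follows. This is only an illustration: it uses the GCC/Clang builtin `__builtin_cpu_supports` instead of the cpu_features library linked above, the names `CpuFeatures`, `detect_cpu` and `pick_backend` are hypothetical, and the library file names are the ones from the list. Actually loading the chosen library (`dlopen` on Linux/Mac, `LoadLibrary` on Windows) is left out.

```cpp
#include <string>

// Hypothetical summary of the ISA extensions we care about.
struct CpuFeatures {
    bool avx512f;
    bool avx2;
    bool avx;
};

// Query the running CPU. Assumption: GCC or Clang on x86; on any other
// toolchain or architecture this sketch conservatively reports no extensions.
CpuFeatures detect_cpu() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return {__builtin_cpu_supports("avx512f") != 0,
            __builtin_cpu_supports("avx2") != 0,
            __builtin_cpu_supports("avx") != 0};
#else
    return {false, false, false};
#endif
}

// Pick the most capable backend the CPU can run, falling back to SSE.
std::string pick_backend(const CpuFeatures& f) {
    if (f.avx512f) return "avx512.so";
    if (f.avx2) return "avx2.so";
    if (f.avx) return "avx1.so";
    return "sse.so";
}
```

The main program, built without any vectorization flags, would call `pick_backend(detect_cpu())` once at startup and load the returned library before calling into the math API.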
>>
>> Now, I have the feeling that the long-term solution would be for Eigen to
>> do a minimum of JIT. Example: oneDNN with asmjit:
>> https://github.com/asmjit/asmjit
>>
>> Kind
>>
>> W.
>>
>>
>>
>>
>> ------------------------------
>>
>> *From:* Edward Lam <[email protected]>
>> *Sent:* Thursday, September 17, 2020 9:24 PM
>> *To:* [email protected] <[email protected]>
>> *Subject:* Re: [eigen] Vectorization for general use
>>
>>
>>
>> Offhand, I wonder if you could put main() in its own source file and
>> compile it without any vectorization compiler options, and have that call
>> your real main() renamed in a different source file that does have
>> vectorization compiler options enabled. Then your new main() could do CPUID
>> checks (e.g. https://stackoverflow.com/a/4823889) and bail out
>> gracefully. You will of course need to ensure that the CPUID checks are
>> accurate for your compiler options, which may present its own challenges.
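
A minimal sketch of the wrapper idea described above, under some assumptions: the build targets AVX, the compiler is GCC or Clang on x86 (so `__builtin_cpu_supports` is available), and the names `EntryPoint`, `cpu_supports_avx` and `launch` are hypothetical. This file would be compiled without vectorization flags, while the renamed real main lives in a separate source file that is compiled with them.

```cpp
#include <cstdio>

// Signature of the renamed "real" main, defined in a vectorized source file.
using EntryPoint = int (*)(int, char**);

// Assumption: GCC or Clang on x86. Other platforms report "not supported"
// in this sketch; a real build would need a platform-specific check there.
bool cpu_supports_avx() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;
#endif
}

// Bail out with a readable message instead of crashing on an illegal
// instruction; otherwise hand control to the vectorized entry point.
int launch(bool cpu_ok, EntryPoint real_main, int argc, char** argv) {
    if (!cpu_ok) {
        std::fprintf(stderr,
                     "This build requires a CPU with AVX support.\n");
        return 1;
    }
    return real_main(argc, argv);
}

// The new main() would then be a one-liner along these lines:
//   int main(int argc, char** argv) {
//       return launch(cpu_supports_avx(), real_main, argc, argv);
//   }
```

The check has to match the flags used for the vectorized file: a binary built for AVX2 needs an AVX2 check, and so on.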
>>
>>
>>
>> Cheers,
>>
>> -Edward
>>
>>
>>
>> On Thu, Sep 17, 2020 at 10:52 PM Rob McDonald <[email protected]>
>> wrote:
>>
>> I maintain an open source program that uses Eigen.  The vast majority of
>> my users do not compile the program, instead downloading a pre-compiled
>> binary from our website.  About 80% are on Windows, 10% on Mac and 10% on
>> Linux.  I only provide x86 builds, 32- and 64-bit on Windows, 64-bit only on
>> Mac and Linux.  We may eliminate the 32-bit Windows build soon.
>>
>>
>>
>> Historically, I have compiled with no special flags enabling
>> vectorization options for the CPU.  I would like to pursue this as I expect
>> it will unlock some nice performance gains.  However, I'd like to keep
>> things simple and compatible for users.
>>
>>
>>
>> What happens when someone runs a program compiled with vectorization when
>> their CPU does not support it?  If it fails, how graceful is the failure?
>>
>>
>>
>> Is there a standard approach to identify the capabilities of a given
>> machine?  I could add that to my program and survey users before making a
>> change...  Would such code still run on a machine that was in the process
>> of failing due to not having support for the built in vectorization?  I.e.
>> if it is crashing, can we send a message as to why we're going down?
>>
>>
>>
>> Is there a graceful way to support multiple options?
>>
>>
>>
>> Any tips from other broad-use applications are greatly appreciated.
>>
>>
>>
>> Rob
>>
>>
>>
>>
>>
>>
>>
>>
>>
