@pabs

> I suggest you compile all accelerated versions of the code for the
> target platform (for eg no SSE on powerpc) into the main binary and
> use FMV to automatically select the correct one at runtime:
>
> https://lwn.net/Articles/691932/

Thanks for the link!
While this is tempting, I'd like to keep my code as general as possible. Also, I prefer compiling with clang++, which, for my use case, produces faster code, and clang++ does not support constructs like

__attribute__((target_clones("avx2","arch=atom","default")))

Additionally, I find the notion of using target_clones to compile a specific 'function' hard to digest. What do they mean by 'function'? I write generic C++ code; most of it is inlined from header-only libraries, and it's all object-oriented. There are next to no 'functions' in my code.
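For what it's worth, per-function dispatch does not strictly need hand-written 'functions': a thin, non-template entry point can wrap the inlined generic code, and only that wrapper gets compiled per ISA. The following is a minimal sketch, assuming GCC or clang on x86. It uses the plain target attribute plus __builtin_cpu_supports rather than target_clones, and the names scale, scale_base, scale_avx2 and pick_scale are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Generic, header-style code: it is inlined into whichever entry point calls it.
template <typename T>
inline void scale(T *data, std::size_t n, T factor) {
    assert(data != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) data[i] *= factor;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1  // hypothetical helper macro for this sketch
#endif

// Baseline entry point, compiled with the default flags.
void scale_base(float *data, std::size_t n, float factor) {
    scale(data, n, factor);
}

#ifdef HAVE_X86_DISPATCH
// Same body, but this one entry point is compiled with AVX2 enabled.
__attribute__((target("avx2")))
void scale_avx2(float *data, std::size_t n, float factor) {
    scale(data, n, factor);
}
#endif

using scale_fn = void (*)(float *, std::size_t, float);

// Pick an implementation once, based on the CPU the program is running on.
scale_fn pick_scale() {
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return scale_avx2;
#endif
    return scale_base;
}
```

The caller stores the result of pick_scale() in a function pointer at start-up and calls through it, so the choice costs one indirect call per invocation. Whether the generic code really gets inlined into the AVX2 wrapper depends on the optimisation level.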

@wookey

> There is (as yet) no mechanism in packing to select packages by
> hardware variant or optimisation. It has been mooted, and could be
> done, but it's a big job, which would take years to roll out, and
> no-one has stepped up to make it work. So for now your favourite
> mechanism is not possible.

I had something simpler in mind. I had hoped that a Debian package could provide some sort of target-side script which is run when the package is installed on the user's machine. Then it would be easy to have a bit of code à la

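#!/bin/bash

# Flag names as Linux reports them (SSE3 shows up as "pni"), ordered
# from oldest to newest, so the last match wins.
for instruction_set in mmx sse sse2 pni ssse3 sse4a sse4_1 sse4_2 avx avx2 avx512f avx512pf avx512er avx512cd
do
  # -w matches whole words only, so "sse" does not also match "sse2"
  if lscpu | grep -qw "$instruction_set"
  then
    bestarch=$instruction_set
  fi
done

mv "myprogram_$bestarch" "$target_bin/myprogram"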

... and optionally delete the binaries for the other ISAs.

As you can see, I have no clue as to what a Debian package can or cannot do... :(
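The same selection could also happen at start-up rather than at install time, with a small launcher built for the base ISA that picks one of the per-ISA binaries. This is only a rough sketch, assuming GCC or clang on x86; the function names, the "base" suffix and the directory in the usage note are hypothetical, and the list of variants is shortened.

```cpp
#include <cassert>
#include <string>

// Suffix of the most capable variant the running CPU supports.
// Checks run from oldest to newest ISA, so the last supported one wins.
std::string best_variant() {
    std::string best = "base";  // hypothetical name of the baseline build
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))    best = "sse2";
    if (__builtin_cpu_supports("sse4.2"))  best = "sse4.2";
    if (__builtin_cpu_supports("avx"))     best = "avx";
    if (__builtin_cpu_supports("avx2"))    best = "avx2";
    if (__builtin_cpu_supports("avx512f")) best = "avx512f";
#endif
    return best;
}

// Full path of the binary to start, e.g. <dir>/<name>_avx2.
std::string variant_path(const std::string &dir, const std::string &name) {
    assert(!dir.empty() && !name.empty());
    return dir + "/" + name + "_" + best_variant();
}
```

A launcher's main() would then hand variant_path("/usr/lib/myprogram", "myprogram") to execv(), passing its own arguments through. All variants stay installed, so the choice is made afresh on every start.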

> Does this software only work on x86 or does it work on other
> architectures, with other vector units (neon, altivec)? Remember to
> consider more than just x86 when pondering this issue.

I am using Vc, so whatever Vc supports, my software supports as well. Vc is a generic C++ library that abstracts away the architecture, so I just compile with -mmmx, -msse or whatever, and Vc adapts and produces code for the specific target. I've written my program so that it also runs without using the vector units: this is done with simple #ifdefs, and I either pass the definition in via a -D option or leave it out to produce code without vectorization. Alternatively, Vc can produce a scalar implementation of the vector code, which runs on any platform and gives equivalent results.
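To illustrate the #ifdef scheme, here is a minimal sketch rather than my actual code. The macro name USE_VC and the function scale are made up for this example, and the Vc calls assume the Vc 1.x API.

```cpp
#include <cassert>
#include <cstddef>

#ifdef USE_VC  // hypothetical macro, passed to the compiler as -DUSE_VC
#include <Vc/Vc>
#endif

// Multiply every element of data[0..n) by factor.
void scale(float *data, std::size_t n, float factor) {
    assert(data != nullptr || n == 0);
    std::size_t i = 0;
#ifdef USE_VC
    // Process full vectors; their width follows from the -m flags in use.
    for (; i + Vc::float_v::Size <= n; i += Vc::float_v::Size) {
        Vc::float_v v;
        v.load(data + i, Vc::Unaligned);
        v *= factor;
        v.store(data + i, Vc::Unaligned);
    }
#endif
    // Scalar loop: handles the remainder, or everything without USE_VC.
    for (; i < n; ++i) data[i] *= factor;
}
```

Built with something like -DUSE_VC -mavx2, the loop over full vectors does most of the work; built without -DUSE_VC, only the scalar loop is compiled and Vc is not needed at all.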

On 10.05.2017 at 13:56, Christian Seiler wrote:

> On 05/10/2017 11:52 AM, Wookey wrote:
>> Debian requires packages to run on the base level ISA defined for each
>> architecture (which does change slowly over time).
>
> Well, kind of. What Debian requires is that if it is at all feasible
> software should run on the base ISA - which in practice means that
> very often the software is only compiled for the base ISA itself,
> resulting in the binaries being slower than they need to be on more
> modern hardware.

So by simply switching off vectorization, I can easily provide some base-level code which should run on a wide range of targets. But that's really not the point of my software: I am doing real-time geometric transformations of images at 50/60 Hz, and the base-level code without the vector units can barely cope.

> ... just not a RC bug.

Pardon me, but what's an RC bug?

> With that all out of the way: if a package does support being
> compiled for the base ISA, or a patch to make it work is trivial,
> then it would be considered RC (on release archs at least) not
> supporting the base ISA in the compiled package. What's not required
> is to also compile optimized versions that run faster on newer
> hardware - but in an ideal world one would also like to do that.

So, yes, I can create base-level machine code as well as machine code for every platform Vc supports. That still leaves me with the open question of how best to proceed.

Kay

