Re: What CPU extensions can we assume are available by arch?

2020-04-28 Thread Steven Munroe
>Kevin Kofler wrote:
>>Dave Love wrote:
>> As far as I can tell, it doesn't do dynamic dispatch.  Experimentally,
>> if you build with -mavx2 and run on ivybridge (or for skx and run on
>> haswell), the test code generates illegal instruction errors.  It's some
>> time ago since I looked at it, and I don't remember, but that's probably
>> why I thought it wasn't very useful, apart from only having x86 and arm
>> support.
>
>That's because it is a header-only library. It is either hard (and non-
>portable) or impossible to do dynamic dispatch in a header-only library,
>depending on the compiler and compiler version. (At least in past GCC
>versions, it was impossible. This may or may not have improved since.)
>Hence, dynamic dispatch, if desired, is left to the client application. And
>I agree that this makes the library mostly useless.

Less useful than we would like. But it could be part of the larger solution.
Properly structured, such headers could help (not impede) building shared
objects with multiple implementations and dynamic selection for vector codes.

>You need an actual shared object with actual functions to do dynamic
>dispatch nicely and transparently.

I think the problem is lack of good documentation and examples. And perhaps
motivation. GLIBC has been doing this for some time, but when it started this
was hard to do. The compiler and runtime infrastructure have improved since.
Perhaps we can make this simple enough for wide application to higher level
libraries.
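
As a rough sketch of the mechanism GLIBC relies on (GNU indirect functions,
"ifunc"), the following C file exports one symbol and lets the dynamic linker
pick the implementation once, when the symbol is bound. It assumes GCC on
x86_64 Linux with glibc; x86_64 and AVX2 are used only for illustration, and
the names sum_generic, sum_avx2, resolve_sum_i32 and sum_i32 are hypothetical.
On other platforms the sketch falls back to the generic code.

```c
#include <stddef.h>

/* Baseline version: built with the default flags, runs on any CPU. */
static long sum_generic(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

#if defined(__x86_64__) && defined(__linux__) && defined(__GNUC__) && !defined(__clang__)

/* Same source, but GCC may use AVX2 inside this one function only. */
__attribute__((target("avx2")))
static long sum_avx2(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Resolver: called by the dynamic linker, returns the version to bind. */
static long (*resolve_sum_i32(void))(const int *, size_t)
{
    __builtin_cpu_init();  /* needed before CPU queries inside a resolver */
    return __builtin_cpu_supports("avx2") ? sum_avx2 : sum_generic;
}

/* Public entry point; callers see an ordinary function. */
long sum_i32(const int *v, size_t n)
    __attribute__((ifunc("resolve_sum_i32")));

#else

/* Assumed fallback where ifunc is not available: no dispatch at all. */
long sum_i32(const int *v, size_t n)
{
    return sum_generic(v, n);
}

#endif
```

Built into a shared object, a client just calls sum_i32(); the selection is
invisible to it, which is the "nicely and transparently" part.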

I am finding, for the implementation of pveclib, that vector multiple-quadword
(128-bit) precision multiplies are large enough that you don't want to expand
them inline. But the vector multiply quadword itself and the operations that
implement it can and should be inline (for ppc64le). So I was forced to cross
that boundary.

Give me some time and I will have documentation/examples that might be
generally useful.
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: What CPU extensions can we assume are available by arch?

2020-04-28 Thread Kevin Kofler
Dave Love wrote:
> As far as I can tell, it doesn't do dynamic dispatch.  Experimentally,
> if you build with -mavx2 and run on ivybridge (or for skx and run on
> haswell), the test code generates illegal instruction errors.  It's some
> time ago since I looked at it, and I don't remember, but that's probably
> why I thought it wasn't very useful, apart from only having x86 and arm
> support.

That's because it is a header-only library. It is either hard (and non-
portable) or impossible to do dynamic dispatch in a header-only library, 
depending on the compiler and compiler version. (At least in past GCC 
versions, it was impossible. This may or may not have improved since.) 
Hence, dynamic dispatch, if desired, is left to the client application. And 
I agree that this makes the library mostly useless.

You need an actual shared object with actual functions to do dynamic 
dispatch nicely and transparently.

Kevin Kofler


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Steven Munroe
Dan writes:
>> Dave Love writes:
>> Well, what I've already said from some experience and research. Where's the
>> POWER and S390 support? All I saw is x86 and arm. We've heard there's
>> ppc64le compatibility support anyhow, which is rising
>
>gcc provides compat x86 intrinsics via "x86intrin.h", they were successfully
>used eg. by darktable (raw photo sw)

Yes there are some, looks like GCC8 and later. This is what is in GCC9:
$ find /opt/at13.0/ -name '*intrin.h'
/opt/at13.0/lib/gcc/powerpc64le-linux-gnu/9.2.1/include/bmi2intrin.h
/opt/at13.0/lib/gcc/powerpc64le-linux-gnu/9.2.1/include/mmintrin.h
/opt/at13.0/lib/gcc/powerpc64le-linux-gnu/9.2.1/include/x86intrin.h
/opt/at13.0/lib/gcc/powerpc64le-linux-gnu/9.2.1/include/htmintrin.h
/opt/at13.0/lib/gcc/powerpc64le-linux-gnu/9.2.1/include/pmmintrin.h
/opt/at13.0/lib/gcc/powerpc64le-linux-gnu/9.2.1/include/htmxlintrin.h
/opt/at13.0/lib/gcc/powerpc64le-linux-gnu/9.2.1/include/xmmintrin.h
/opt/at13.0/lib/gcc/powerpc64le-linux-gnu/9.2.1/include/tmmintrin.h
/opt/at13.0/lib/gcc/powerpc64le-linux-gnu/9.2.1/include/emmintrin.h
/opt/at13.0/lib/gcc/powerpc64le-linux-gnu/9.2.1/include/bmiintrin.h
/opt/at13.0/lib/gcc/powerpc64le-linux-gnu/9.2.1/include/smmintrin.h

So most of SSE, but not AVX yet.

Of course this is a compromise. It may not be as optimal as a native
PowerISA implementation on POWER (CISC vs RISC, out-of-order,
super-scalar). This is what PVECLIB is for.
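
To illustrate what the compat headers allow, here is a small sketch that uses
only SSE2 intrinsics from emmintrin.h. It builds natively on x86_64 and,
assuming GCC 8 or later on ppc64le, should also build against the compat
headers listed above; those headers expect -DNO_WARN_X86_INTRINSICS on the
command line to acknowledge the porting caveat. The function name add4_i32
and the scalar fallback are assumptions made for this example.

```c
#include <stdint.h>

#if defined(__SSE2__) || \
    (defined(__powerpc64__) && defined(NO_WARN_X86_INTRINSICS))
#include <emmintrin.h>  /* SSE2 API; provided as a compat header on ppc64le */
#define HAVE_SSE2_API 1
#endif

/* Hypothetical helper: add four 32-bit integers lane by lane. */
void add4_i32(const int32_t *a, const int32_t *b, int32_t *out)
{
#ifdef HAVE_SSE2_API
    __m128i va = _mm_loadu_si128((const __m128i *)a);  /* unaligned loads */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#else
    /* Plain C fallback where the SSE2 API is not available. */
    for (int i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
#endif
}
```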


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Steven Munroe
Kevin Kofler wrote:

>>Florian Weimer wrote:
>>...
>> In general, that is not true. You can use the target function attribute
>> and function multiversioning instead in many cases.
>
>This may (or may not) have been recently fixed, but last I checked,
>intrinsics for a vector instruction set XYZ2 were only available with the
>corresponding -mxyz2 flag, which could only be set per compilation unit
>(attempting to use the recently introduced flag pragmas for this purpose was
>not supported either), and which would lead to GCC in some cases (out of the
>library programmer's control) using XYZ2 instructions also in other
>functions. As a result, the only safe way was to put all XYZ2 code in a
>separate compilation unit compiled with -mxyz2 (and only that compilation
>unit can use that flag). Has that situation improved recently (and how so)?
>
>And that is just for C or non-template C++ code. C++ template instantiations
>make the situation even messier.

Yes, still true. I encountered this implementing vec_int512_ppc.h for pveclib.
There is an easy-to-use workaround described in "Putting the library in
PVECLIB". I will not claim this is elegant, but it does work.

My goal (pveclib) is to make it easier to write vector codes that CAN be
used in functions that CAN be built for multiple processor targets with
dynamic selection. My case is simpler than x86 (and SIMDE), as for ppc64le I
only have to deal with power8/9.

This is a hard problem, and no solution is easy or perfect. But we can improve.


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Jun Aruga
> gcc provides compat x86 intrinsics via "x86intrin.h", they were
> successfully used eg. by darktable (raw photo sw)

Okay. Thanks for the info.

-- 
Jun | He - His - Him


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Jun Aruga
> They're packages for a header-only system, not libraries, hence my
> confusion.

Hi Dave, okay.

> Who's pushing this, and how do I present an HPC performance engineering
> perspective?

I appreciate your interest in bioinformatics on HPC.
The simde Fedora package: Me
The simde Debian package: Michael  (You can see his account in the
simde project. The author of the Common Workflow Language).

If you want to discuss something related to simde, opening a ticket
upstream might be good.
https://github.com/nemequ/simde

My target is to enable the set of bio tools RPMs used in COVID-19
analysis on ppc64le and aarch64 based HPC use cases.
As you know, there are ppc64le and aarch64 based HPC systems where it's hard
to run some of the bio-tools.

> Well, what I've already said from some experience and research.  Where's
> the POWER and S390 support?  All I saw is x86 and arm.  We've heard
> there's ppc64le compatibility support anyhow, which is rising up my list
> to investigate for work.  Who's going to do bioinformatics on Blue
> Boxes?  Bioinf causes enough trouble on x86...

bowtie2
* POWER: supported from version 2.4.0.
  > Added preliminary support for ppc64le architectures with the help
  > of SIMDE project
  https://github.com/BenLangmead/bowtie2/blob/b8c13c0401885873a99b229fca668deecde08749/NEWS#L46-L47
  Travis CI ppc64le job:
  https://github.com/BenLangmead/bowtie2/blob/master/.travis.yml#L75
* s390x: Not supported yet. Some tests failed.
  https://github.com/BenLangmead/bowtie2/issues/286

minimap2
https://github.com/lh3/minimap2
There is no official support yet, but there has been a little discussion
about it. You can see that Makefile.simde was added in the latest commits.
That means we can potentially try to build it. I am planning to
contribute to upstream for that.

> Who's going to do bioinformatics on Blue Boxes?  Bioinf causes enough trouble 
> on x86...

What are Blue Boxes and Bioinf?

> There's been an issue for bowtie2 for years, and I've long had it in
> copr, though it probably needs updating there.  Has someone done the
> unresponsive maintainer thing for Adam Huffman (who is, or was, back
> more in that area)?

You can pick up the latest bowtie2 version for x86 and arm from my copr
repository here.
Copr does not have a ppc64le build root right now.
https://copr.fedorainfracloud.org/coprs/jaruga/staging/packages/
Or from my scratch build to pick up the ppc64le version.
https://koji.fedoraproject.org/koji/taskinfo?taskID=43660524

As I said, bowtie2 has been in review status for the Fedora dist-git since
last Friday.
I sometimes work with Adam on bioinformatics tools. He is too busy with
his main job to respond.
I have been co-maintainer of some of his RPM packages such as
samtools, bwa, bowtie, working on them instead of him.

If you are interested in bioinformatics, DNA/RNA sequencing in Fedora,
Fedora medical-sig is the place. We can talk there.
You may see some information that you are interested in there.
https://fedoraproject.org/wiki/SIGs/Medical

Cheers,
Jun

-- 
Jun | He - His - Him


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Dan Horák
On Mon, 27 Apr 2020 16:23:39 +0100
Dave Love  wrote:

> Jun Aruga  writes:
> 
> >> > For the better user experience, we are promoting the system
> >> > library for both RPM and Debian based Linux distributions.
> >>
> >> What system library, for what purpose?
> >
> > Hi Dave,
> > The simde system library is
> >
> > * simde-devel in Fedora and EPEL [1]
> > * libsimde-dev in Debian [2]
> 
> They're packages for a header-only system, not libraries, hence my
> confusion.
> 
> Who's pushing this, and how do I present an HPC performance
> engineering perspective?
> 
> > The purpose is to build and ship SIMD implemented software RPM such
> > as bowtie2 [3] and minimap2 [4] on non-x86_64 CPU architecture such
> > as aarch64, ppc64le and s390x.
> 
> Well, what I've already said from some experience and research.
> Where's the POWER and S390 support?  All I saw is x86 and arm.  We've
> heard there's ppc64le compatibility support anyhow, which is rising

gcc provides compat x86 intrinsics via "x86intrin.h", they were
successfully used eg. by darktable (raw photo sw)


Dan

> up my list to investigate for work.  Who's going to do bioinformatics
> on Blue Boxes?  Bioinf causes enough trouble on x86...
> 
> There's been an issue for bowtie2 for years, and I've long had it in
> copr, though it probably needs updating there.  Has someone done the
> unresponsive maintainer thing for Adam Huffman (who is, or was, back
> more in that area)?


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Dave Love
Jun Aruga  writes:

>> > For the better user experience, we are promoting the system library
>> > for both RPM and Debian based Linux distributions.
>>
>> What system library, for what purpose?
>
> Hi Dave,
> The simde system library is
>
> * simde-devel in Fedora and EPEL [1]
> * libsimde-dev in Debian [2]

They're packages for a header-only system, not libraries, hence my
confusion.

Who's pushing this, and how do I present an HPC performance engineering
perspective?

> The purpose is to build and ship SIMD implemented software RPM such as
> bowtie2 [3] and minimap2 [4] on non-x86_64 CPU architecture such as
> aarch64, ppc64le and s390x.

Well, what I've already said from some experience and research.  Where's
the POWER and S390 support?  All I saw is x86 and arm.  We've heard
there's ppc64le compatibility support anyhow, which is rising up my list
to investigate for work.  Who's going to do bioinformatics on Blue
Boxes?  Bioinf causes enough trouble on x86...

There's been an issue for bowtie2 for years, and I've long had it in
copr, though it probably needs updating there.  Has someone done the
unresponsive maintainer thing for Adam Huffman (who is, or was, back
more in that area)?


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Dave Love
Kevin Kofler  writes:

> Florian Weimer wrote:
>> In general, that is not true.  You can use the target function attribute
>> and function multiversioning instead in many cases.
>
> This may (or may not) have been recently fixed, but last I checked, 
> intrinsics for a vector instruction set XYZ2 were only available with the 
> corresponding -mxyz2 flag, which could only be set per compilation unit 
> (attempting to use the recently introduced flag pragmas for this purpose was 
> not supported either), and which would lead to GCC in some cases (out of the 
> library programmer's control) using XYZ2 instructions also in other 
> functions. As a result, the only safe way was to put all XYZ2 code in a 
> separate compilation unit compiled with -mxyz2 (and only that compilation 
> unit can use that flag). Has that situation improved recently (and how so)?

I haven't needed to try it so far, but I'd trust Florian on the topic,
given the GCC doc on function attributes and pragmas.  You might not
have intrinsics anyhow.


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Kevin Kofler
Florian Weimer wrote:
> In general, that is not true.  You can use the target function attribute
> and function multiversioning instead in many cases.

This may (or may not) have been recently fixed, but last I checked, 
intrinsics for a vector instruction set XYZ2 were only available with the 
corresponding -mxyz2 flag, which could only be set per compilation unit 
(attempting to use the recently introduced flag pragmas for this purpose was 
not supported either), and which would lead to GCC in some cases (out of the 
library programmer's control) using XYZ2 instructions also in other 
functions. As a result, the only safe way was to put all XYZ2 code in a 
separate compilation unit compiled with -mxyz2 (and only that compilation 
unit can use that flag). Has that situation improved recently (and how so)?

And that is just for C or non-template C++ code. C++ template instantiations 
make the situation even messier.

Kevin Kofler


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Richard Shaw
On Mon, Apr 27, 2020 at 7:26 AM Stephen Gallagher 
wrote:

> On Wed, Apr 22, 2020 at 10:52 AM Richard Shaw 
> wrote:
> > Could we treat them like arches?
> >  Something like:
> > X86_64 & X86_64-AVX2
> > armv7hf & armv7fh-NEON
> > etc...
> >
> > It would absolutely have to be OPT IN because once you enable it your
> > system is no longer necessarily portable to other hardware.
> >
>
> FYI, you just described a major component of the ELN Compose we are
> working on for Fedora 33:
> https://fedoraproject.org/wiki/Changes/ELN_Buildroot_and_Compose


I had been following the "lengthy" discussions on the list, but I thought
it was just targeting rawhide for some reason. That may work in the future,
so perhaps a less-than-ideal hack would work in the meantime.

Thanks,
Richard


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Florian Weimer
* Kevin Kofler:

> Jun Aruga wrote:
>> simde is a header files only library. It's not a binary.
>
> Then it is inherently unsuitable for automatic runtime selection, and
> all the runtime selection still has to be done in the client program.
>
> GCC requires you to compile for separate vector instruction sets in
> separate compilation units. A header-only library cannot possibly do
> that for you.

In general, that is not true.  You can use the target function attribute
and function multiversioning instead in many cases.
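
A minimal sketch of the target-attribute approach, kept in a single
compilation unit built with baseline flags. It assumes GCC 4.9 or later (or a
recent Clang) on x86_64, where intrinsics can be used inside a function
tagged with the matching target attribute; the function names are
hypothetical, and the runtime check uses __builtin_cpu_supports.

```c
#include <stddef.h>

/* Portable version, compiled for the baseline instruction set. */
static void add_i32_scalar(const int *a, const int *b, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>

/* Only this function is compiled for AVX2; no -mavx2 on the command line. */
__attribute__((target("avx2")))
static void add_i32_avx2(const int *a, const int *b, int *out, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {              /* 8 x 32-bit lanes per step */
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(out + i), _mm256_add_epi32(va, vb));
    }
    for (; i < n; i++)                        /* scalar tail */
        out[i] = a[i] + b[i];
}
#endif

/* Dispatcher: checks the CPU at run time and picks a version. */
void add_i32(const int *a, const int *b, int *out, size_t n)
{
#if defined(__x86_64__) && defined(__GNUC__)
    if (__builtin_cpu_supports("avx2")) {
        add_i32_avx2(a, b, out, n);
        return;
    }
#endif
    add_i32_scalar(a, b, out, n);
}
```

Because both versions are static functions in the same file, a header-only
library could in principle ship this pattern, at the cost of a CPU check on
every call unless the result is cached.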

Thanks,
Florian


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Stephen Gallagher
On Wed, Apr 22, 2020 at 10:52 AM Richard Shaw  wrote:
>
> On Wed, Apr 22, 2020 at 9:28 AM Artur Iwicki  wrote:
>>
>> Regarding x86_64 and AVX2, last year we had a very heated discussion about 
>> this on the mailing list.
>>
>> https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/MPFXH4Y5LDC5L2VXWKUHAX3WAKBQXR4U/#MPFXH4Y5LDC5L2VXWKUHAX3WAKBQXR4U
>>
>> tl;dr: there was a proposal to make "x86_64" in Fedora mean "must support at 
>> least AVX2" and it met with a lot of backlash.
>
>
> Now that you mention it that does tickle some brain cells...
>
> So it seems what's really needed is a method to support software with 
> optimizations above the baseline without leaving other people behind.
>
> The only way I can think of to do that that would be to have optional 
> "enhanced" repos available that people can enable if their system supports 
> it. The hard part would be keeping it in sync with the main repo. It would 
> have to be a parallel build process and similar to the current process if one 
> fails the build is a NOGO across the board.
>
> Could we treat them like arches?
>  Something like:
> X86_64 & X86_64-AVX2
> armv7hf & armv7fh-NEON
> etc...
>
> It would absolutely have to be OPT IN because once you enable it your system 
> is no longer necessarily portable to other hardware.
>

FYI, you just described a major component of the ELN Compose we are
working on for Fedora 33:
https://fedoraproject.org/wiki/Changes/ELN_Buildroot_and_Compose


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Jun Aruga
> > For the better user experience, we are promoting the system library
> > for both RPM and Debian based Linux distributions.
>
> What system library, for what purpose?

Hi Dave,
The simde system library is

* simde-devel in Fedora and EPEL [1]
* libsimde-dev in Debian [2]

The purpose is to build and ship SIMD implemented software RPM such as
bowtie2 [3] and minimap2 [4] on non-x86_64 CPU architecture such as
aarch64, ppc64le and s390x.

[1] https://src.fedoraproject.org/rpms/simde/
[2] https://wiki.debian.org/SIMDEverywhere
[3] 
https://raw.githubusercontent.com/junaruga/fedora-bowtie2/master/bowtie2.spec
 In review status on https://bugzilla.redhat.com/show_bug.cgi?id=1824348 .
[4] https://github.com/lh3/minimap2

-- 
Jun | He - His - Him


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Dave Love
Jun Aruga  writes:

> For the better user experience, we are promoting the system library
> for both RPM and Debian based Linux distributions.

What system library, for what purpose?


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Dave Love
Jun Aruga  writes:

> You do not need to care about the availability by arch or compiler
> when using this library.

As far as I can tell, it doesn't do dynamic dispatch.  Experimentally,
if you build with -mavx2 and run on ivybridge (or for skx and run on
haswell), the test code generates illegal instruction errors.  It's some
time ago since I looked at it, and I don't remember, but that's probably
why I thought it wasn't very useful, apart from only having x86 and arm
support.

Note also that there's typically rather more than SIMD to consider for
different micro-architectures, particularly fitting the memory
hierarchy.


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Kevin Kofler
Jun Aruga wrote:
> simde is a header files only library. It's not a binary.

Then it is inherently unsuitable for automatic runtime selection, and all 
the runtime selection still has to be done in the client program.

GCC requires you to compile for separate vector instruction sets in separate 
compilation units. A header-only library cannot possibly do that for you.

(This is an issue for all header-only vectorization libraries, e.g., Eigen.)

Kevin Kofler


Re: What CPU extensions can we assume are available by arch?

2020-04-27 Thread Kevin Kofler
Richard Shaw wrote:
> It's funny we just had this conversation yesterday, I woke up to a pull
> request to add SSE support.
> 
> https://github.com/drowe67/LPCNet/pull/25

Unfortunately, that requires SSE4.1, which is still too much to assume.

Looking at the comments, the author actually tried SSE2, but it was too slow 
for real-time operation on his machines. Though really fast CPUs (such as 
your Ryzen 5 2600) can actually even run the unoptimized C in real time, but 
those are normally modern ones that also support the AVX and even AVX2 
optimizations (yours does), so it does not help us that much either.

Kevin Kofler


Re: What CPU extensions can we assume are available by arch?

2020-04-26 Thread Richard Shaw
On Sun, Apr 26, 2020 at 5:25 PM Dave Love 
wrote:

> Kevin Kofler  writes:
>
> > Has anyone (upstream or elsewhere) ever looked into doing an SSE2 version
> > of the vector code? It should be faster than scalar (especially considering
> > that the "scalar" floating-point code (under the default -mfpmath=sse)
> > actually loads everything into SSE2 registers as well, but does not
> > actually make use of the vectorization) and it would match the baseline of
> > many distributions and upstreams out there.
>
> What's preventing vectorization with sse2 (or other architectures' base
> SIMD) anyhow, if anything?  Use something like
>
> gcc -Ofast -fopt-info-vec-missed
>

I can't comment on the exact command line used, but I did experiment with a
recent pull request adding SSE 4.1 support.

Full details here:

https://github.com/drowe67/LPCNet/pull/25

Thanks,
Richard


Re: What CPU extensions can we assume are available by arch?

2020-04-26 Thread Steven Munroe
> Again it's not the binaries. People need to build with it. But do not need to
> distribute it for each project. For the better user experience, we are
> promoting the system library for both RPM and Debian based Linux
> distributions.

Again, compiling with an "enhanced API" like simde or pveclib is only
the first step to delivering an optimized package library (LPCNet in
this example). Most application developers do not rebuild from source.
They want the package installed from the distro to be optimized for their
platform.


Re: What CPU extensions can we assume are available by arch?

2020-04-26 Thread Dave Love
Kevin Kofler  writes:

> Has anyone (upstream or elsewhere) ever looked into doing an SSE2 version of 
> the vector code? It should be faster than scalar (especially considering 
> that the "scalar" floating-point code (under the default -mfpmath=sse) 
> actually loads everything into SSE2 registers as well, but does not actually 
> make use of the vectorization) and it would match the baseline of many 
> distributions and upstreams out there.

What's preventing vectorization with sse2 (or other architectures' base
SIMD) anyhow, if anything?  Use something like

gcc -Ofast -fopt-info-vec-missed

on the performance-critical parts for clues.  Use -Ofast if you don't
care about conforming maths, which you presumably don't, especially if
using NEON.
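
As a small illustration of what such a report can point at, consider the two
loops below (the file name loops.c and the function names are made up for
this example). Compiled with something like
"gcc -O3 -fopt-info-vec-missed -c loops.c", GCC would be expected to flag the
reduction in dot(), since vectorizing it reorders the floating-point
additions; with -Ofast that reordering is permitted.

```c
#include <stddef.h>

/* Element-wise update: each iteration is independent. The restrict
   qualifiers promise that x and y do not overlap, so the compiler needs
   no runtime overlap check before vectorizing. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Floating-point reduction: a vectorized sum adds the terms in a different
   order, which strict IEEE semantics do not allow the compiler to assume
   is equivalent. */
float dot(size_t n, const float *x, const float *y)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i] * y[i];
    return s;
}
```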


Re: What CPU extensions can we assume are available by arch?

2020-04-26 Thread Dave Love
Richard Shaw  writes:

>> tl;dr: there was a proposal to make "x86_64" in Fedora mean "must support
>> at least AVX2" and it met with a lot of backlash.

Yes, that's the wrong approach to performance engineering.

> Now that you mention it that does tickle some brain cells...
>
> So it seems what's really needed is a method to support software with
> optimizations above the baseline without leaving other people behind.

It's certainly true that you should be able to use optimizations above
the Fedora-defined ones, without reviewers jumping on you for enabling
vectorization and unrolling of numerical code.

> The only way I can think of to do that would be to have optional
> "enhanced" repos available that people can enable if their system supports
> it. The hard part would be keeping it in sync with the main repo. It would
> have to be a parallel build process and similar to the current process if
> one fails the build is a NOGO across the board.

The way to support different micro-architectures is clearly how things
like OpenBLAS (or BLIS, if it accepted changes for other architectures)
do it, with dispatch on the micro-architecture in the
performance-critical parts.  Those BLASs do it explicitly (but as far as
I know you need to do something like read /proc/cpuinfo on arm32 and s390).

Anyhow, if no-one's going to do that, or assemble different variants
with target attributes, adding the target_clones function attribute to
generic code for the performance-critical routines is likely to be
profitable.  See the GCC info node on function attributes.  (In my
experience GCC vectorizes a lot better than people often claim, so you
get decent SIMD generated for multiple micro-architectures.)

For example, I added target_clones to the plain C GEMM micro-kernel in
BLIS and got ~60% of the performance of the hand-crafted code for DGEMM
on Haswell.  (GEMM is the basis of matrix-matrix linear algebra, if
you're not familiar with it.)
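
A minimal sketch of what adding target_clones to generic code can look
like, assuming GCC on x86_64 GNU/Linux. The kernel `daxpy` and the macro
`MULTIVERSION` are hypothetical names chosen for illustration; this is not
the BLIS micro-kernel mentioned above.

```c
#include <stddef.h>

/* target_clones needs GCC with IFUNC support (glibc); on anything else
 * the macro expands to nothing so the file still builds as plain C. */
#if defined(__GNUC__) && !defined(__clang__) && \
    defined(__x86_64__) && defined(__gnu_linux__)
#define MULTIVERSION __attribute__((target_clones("avx2", "avx", "default")))
#else
#define MULTIVERSION
#endif

/* Hypothetical kernel: y[i] += a * x[i].
 * GCC emits one clone per listed target plus a resolver that picks
 * the clone matching the CPU when the program is loaded. */
MULTIVERSION
void daxpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

Callers just call daxpy(); the "default" clone is built for the baseline,
so the same binary still runs on CPUs without AVX.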


Re: What CPU extensions can we assume are available by arch?

2020-04-26 Thread Jun Aruga
On Sun, Apr 26, 2020 at 9:41 PM Steven Munroe  wrote:
>
> Jun Aruga writes:
>
> > I recommend using the simde (SIMD Everywhere) library for the packaging and 
> > contribution
> > to the upstream. https://github.com/nemequ/simde
>
> This does not help unless the project LPCNet maintainers are willing to build 
> the fat binaries and support dynamic selection. Otherwise every user is 
> rebuilding from source.

simde is a header files only library. It's not a binary. There is no
shared object file in it. The project needs to bundle the simde in the
project or use the system library for the build.

> > You do not need to care about the availability by arch or compiler when 
> > using this library.
>
> This is exactly what the Power Vector Library PVECLIB is doing for ppc64le. 
> PVECLIB provides POWER7/8/9 VMX/VSX operations equivalent to  
> while smoothing over compiler version and platform ISA differences. PVECLIB 
> also provides useful operations beyond the "Power Vector Intrinsic 
> Programming Reference" defined set (like Multiply 128-bit Quadword). Net you 
> can write using power9 operation extensions and still compile for 
> -mcpu=power7/8. PVECLIB provides the appropriate implementation.
>
> Also, GCC (and I think Clang) provides ppc64le equivalents of the Intel 
> MMX and SSE intrinsic headers (at least for SSE3 and some SSE4). This could 
> be more of a performance compromise (vs PVECLIB) but would be a place to 
> start if you have an existing SSE implementation and want to vectorize 
> for ppc64le.

Okay. I did not know the library. Thanks for the info.

> But again someone needs to build and distribute the fat binaries for each 
> project.

Again it's not the binaries. People need to build with it. But do not
need to distribute it for each project.
For the better user experience, we are promoting the system library
for both RPM and Debian based Linux distributions.

-- 
Jun | He - His - Him


Re: What CPU extensions can we assume are available by arch?

2020-04-26 Thread Steven Munroe
Jun Aruga writes:

> I recommend using the simde (SIMD Everywhere) library for the packaging
> and contribution to the upstream. https://github.com/nemequ/simde

This does not help unless the project LPCNet maintainers are willing to
build the fat binaries and support dynamic selection. Otherwise every user
is rebuilding from source.

> You do not need to care about the availability by arch or compiler when
> using this library.

This is exactly what the Power Vector Library PVECLIB is doing for ppc64le.
PVECLIB provides POWER7/8/9 VMX/VSX operations equivalent to 
while smoothing over compiler version and platform ISA differences. PVECLIB
also provides useful operations beyond the "Power Vector Intrinsic
Programming Reference" defined set (like Multiply 128-bit Quadword). Net
you can write using power9 operation extensions and still compile for
-mcpu=power7/8. PVECLIB provides the appropriate implementation.

Also, GCC (and I think Clang) provides ppc64le equivalents of the
Intel MMX and SSE intrinsic headers (at least for SSE3 and some SSE4). This
could be more of a performance compromise (vs PVECLIB) but would be a place
to start if you have an existing SSE implementation and want to vectorize
for ppc64le.

But again someone needs to build and distribute the fat binaries for each
project.


Re: What CPU extensions can we assume are available by arch?

2020-04-26 Thread Samuel Sieb

On 4/26/20 5:07 AM, Richard Shaw wrote:
> Below is a quick table from the PR showing relative decode performance
> per SIMD pathway:
>
>   * Fedora 31
>   * gcc 9.3.1
>   * Ryzen 5 2600
>
> SIMD      Time (s)   % real time
> None      19.796     39.8%
> SSE 4.1   17.971     36.1%
> AVX       10.185     20.5%
> AVX2       9.459     19.0%


I wonder if the results would be different on an older CPU.  Do you have 
any way to test that?  Would you be able to provide binaries with each 
option for people to test?



Re: What CPU extensions can we assume are available by arch?

2020-04-26 Thread Richard Shaw
On Sat, Apr 25, 2020 at 7:24 AM Kevin Kofler  wrote:

> Richard Shaw wrote:
> > As far as LPCNet itself I've communicated with the primary developer
> > quite a bit over the last week. LPCNet *will not work* without
> > optimizations (at least not in real time which is the point).
>
> Has anyone (upstream or elsewhere) ever looked into doing an SSE2 version
> of the vector code? It should be faster than scalar (especially considering
> that the "scalar" floating-point code (under the default -mfpmath=sse)
> actually loads everything into SSE2 registers as well, but does not
> actually make use of the vectorization) and it would match the baseline of
> many distributions and upstreams out there.
>

It's funny we just had this conversation yesterday, I woke up to a pull
request to add SSE support.

https://github.com/drowe67/LPCNet/pull/25

TL;DR version: on my Ryzen 5 2600, SSE4.1 barely improved performance with
the current LPCNet code. The good news is that a beefy processor can perform
better than real time without optimizations, but that can't be assumed for
everyone. There will be people wanting to run this software on lower-end
laptops which can't keep up in real time.

Below is a quick table from the PR showing relative decode performance per
SIMD pathway:


   - Fedora 31
   - gcc 9.3.1
   - Ryzen 5 2600

SIMD      Time (s)   % real time
None      19.796     39.8%
SSE 4.1   17.971     36.1%
AVX       10.185     20.5%
AVX2       9.459     19.0%

Thanks,
Richard


Re: What CPU extensions can we assume are available by arch?

2020-04-25 Thread John Reiser

On 4/25/20 12:24 UTC, Kevin Kofler wrote:
> Richard Shaw wrote:
>> As far as LPCNet itself I've communicated with the primary developer quite
>> a bit over the last week. LPCNet *will not work* without optimizations (at
>> least not in real time which is the point).
>
> Has anyone (upstream or elsewhere) ever looked into doing an SSE2 version of
> the vector code?

Most of the LPCNet computation is "embarrassingly parallel": for each vector
operation, each output element could be computed simultaneously.  So an SSE2
version would be competitive with the AVX1 version (use the same instructions,
just don't VEX encode them) except for any advantage that AVX gains from
using 3-operand instructions instead of just 2-operand.

The existing code should be enhanced to align each 'float' array to a
cache-line boundary, and to place scalar members of 'struct's into any "holes".
Also, the existing code is single-threaded.  A two-threaded version
with one thread computing vector elements [0, 16*floor(N/32)) and a second
thread computing the rest, would be nearly twice as fast as long as
synchronization was fast [futex to the rescue.]  Two threads might trigger
thermal throttling on older CPUs.
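
A rough sketch of the two suggestions above, assuming a 64-byte cache line
and C11. The names `CACHE_LINE`, `alloc_aligned_floats` and `split_point`
are invented for illustration; none of them come from LPCNet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_LINE 64  /* assumed cache-line size in bytes */

/* 16 floats fill exactly one assumed cache line. */
static_assert(CACHE_LINE == 16 * sizeof(float), "expects 4-byte float");

/* Allocate n floats starting on a cache-line boundary.
 * aligned_alloc (C11) wants the size to be a multiple of the alignment,
 * so the byte count is rounded up.  Release the result with free(). */
float *alloc_aligned_floats(size_t n)
{
    size_t bytes = n * sizeof(float);
    size_t rounded = (bytes + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
    if (rounded == 0)
        rounded = CACHE_LINE;
    return aligned_alloc(CACHE_LINE, rounded);
}

/* Index where the second thread starts: 16*floor(N/32).
 * Thread one handles [0, split), thread two handles [split, n).
 * The split is a multiple of 16 floats, so with a cache-line-aligned
 * array the two threads never write to the same cache line. */
size_t split_point(size_t n)
{
    return 16 * (n / 32);
}
```

For N = 100 the split lands at index 48, so the first thread gets 48
elements and the second gets the remaining 52.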


Re: What CPU extensions can we assume are available by arch?

2020-04-25 Thread Richard Shaw
On Sat, Apr 25, 2020 at 7:24 AM Kevin Kofler  wrote:

> Richard Shaw wrote:
> > As far as LPCNet itself I've communicated with the primary developer
> > quite a bit over the last week. LPCNet *will not work* without
> > optimizations (at least not in real time which is the point).
>
> Has anyone (upstream or elsewhere) ever looked into doing an SSE2 version
> of the vector code? It should be faster than scalar (especially considering
> that the "scalar" floating-point code (under the default -mfpmath=sse)
> actually loads everything into SSE2 registers as well, but does not
> actually make use of the vectorization) and it would match the baseline of
> many distributions and upstreams out there.
>

Well, I think that would be a bit beyond us ham radio guys, the version
we're using is a fork of the main project:

https://github.com/mozilla/LPCNet

Which only provides AVX, AVX2, and NEON options. I don't think the NEON one
is even fast enough (due to the hardware that uses it more than the
instruction set).



> > I think we're going to work towards a static build option, we just have
> > to figure out the mechanics. The main application does check for AVX/AVX2
> > and disables the 2020 mode if it's not available.
>
> Well, if the application does the runtime detection, I guess you can get
> away with relying on that. The problems will start if some other
> application comes and wants to use the library unconditionally.
>

I think the direction we're heading is to include it in the package of the
main FreeDV program. This fork is tightly connected to FreeDV/codec2.
Planning either a static build or private library.

Thanks,
Richard


Re: What CPU extensions can we assume are available by arch?

2020-04-25 Thread Jun Aruga
Hi Richard,

I recommend using the simde (SIMD Everywhere) library for the
packaging and contribution to the upstream.
https://github.com/nemequ/simde

You do not need to care about the availability by arch or compiler
when using this library.

simde-devel is available in Fedora stable versions now. [1]
libsimde-dev is available in Debian, hopefully soon available to the
stable versions and Ubuntu. [2]

Here are the actual examples 1 [3] and 2 [4] implementing it.

[1] https://src.fedoraproject.org/rpms/simde/
[2] https://wiki.debian.org/SIMDEverywhere
[3] https://github.com/lh3/minimap2/pull/597
[4] https://github.com/BenLangmead/bowtie2/blob/master/sse_wrap.h

-- 
Jun | He - His - Him


Re: What CPU extensions can we assume are available by arch?

2020-04-25 Thread Kevin Kofler
Richard Shaw wrote:
> As far as LPCNet itself I've communicated with the primary developer quite
> a bit over the last week. LPCNet *will not work* without optimizations (at
> least not in real time which is the point).

Has anyone (upstream or elsewhere) ever looked into doing an SSE2 version of 
the vector code? It should be faster than scalar (especially considering 
that the "scalar" floating-point code (under the default -mfpmath=sse) 
actually loads everything into SSE2 registers as well, but does not actually 
make use of the vectorization) and it would match the baseline of many 
distributions and upstreams out there.

That said, of course, it would still be slower than AVX and hence possibly 
still too slow for real time (also considering that pre-AVX CPUs are 
typically slower to begin with). I guess it would have to be tried to see 
whether it can work.
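
To give an idea of what a baseline-only vector path might look like, here
is a small sketch of a dot product using 128-bit packed single-precision
intrinsics. The function `dot_f32` is hypothetical and is not LPCNet code;
the packed-float operations used here date from SSE and are part of the
x86_64 baseline, so no extra -m flags should be needed there.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE ones */
#endif

/* Hypothetical dot product, four floats per iteration where possible. */
float dot_f32(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    size_t i = 0;
#if defined(__SSE2__)
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned loads are fine */
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }
    /* Horizontal sum of the four lanes. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    /* Scalar tail, or the whole loop when SSE2 is not available. */
    for (; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
```

Whether such a path is fast enough for real time would still have to be
measured, as noted above.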

> I think we're going to work towards a static build option, we just have to
> figure out the mechanics. The main application does check for AVX/AVX2 and
> disables the 2020 mode if it's not available.

Well, if the application does the runtime detection, I guess you can get 
away with relying on that. The problems will start if some other application 
comes and wants to use the library unconditionally.

Kevin Kofler


Re: What CPU extensions can we assume are available by arch?

2020-04-25 Thread Kevin Kofler
Sam Varshavchik wrote:
> Linux thinkpad 5.5.16-200.fc31.x86_64 #1 SMP Wed Apr 8 16:43:33 UTC 2020
> x86_64 x86_64 x86_64 GNU/Linux
> [
> /proc/cpuinfo:
> 
> flags   : ... avx ...

And even that is not safe to assume.

The baseline is SSE2, nothing more. No SSE3, SSSE3, SSE4.1, SSE4.2, AVX(1), 
AVX2, and most definitely no AVX-512 (none of the defined subsets of it). 
For any of these, CPU support MUST be detected at runtime before attempting 
to use them.
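
One way to do that detection with GCC (and, as far as I know, Clang) is the
`__builtin_cpu_supports` built-in. The sketch below is only an illustration:
the enum and function names are made up, and on non-x86 targets it simply
reports the baseline.

```c
/* Hypothetical labels for the code paths an application might ship. */
enum simd_path { SIMD_BASELINE, SIMD_AVX, SIMD_AVX2 };

/* Pick the best path the running CPU reports support for. */
enum simd_path pick_simd_path(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Needed when this runs before constructors (e.g. in an IFUNC
     * resolver); harmless otherwise. */
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return SIMD_AVX2;
    if (__builtin_cpu_supports("avx"))
        return SIMD_AVX;
#endif
    return SIMD_BASELINE;   /* SSE2 on x86_64 */
}

/* Human-readable name, e.g. for a log line at start-up. */
const char *simd_path_name(enum simd_path p)
{
    switch (p) {
    case SIMD_AVX2: return "avx2";
    case SIMD_AVX:  return "avx";
    default:        return "baseline";
    }
}
```

An application would call pick_simd_path() once at start-up and only enter
AVX/AVX2 code when the returned value allows it.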

Kevin Kofler


Re: What CPU extensions can we assume are available by arch?

2020-04-25 Thread Richard Shaw
On Fri, Apr 24, 2020 at 6:10 PM Steven Munroe  wrote:

>
> The Linux kernel notifies each process of the Platform and Hardware
> Capabilities in the AUX Vector (defined in the Application Binary
> Interface document for each platform). The compilers provide an easy to
> use interface to interrogate this:
>
> https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Basic-PowerPC-Built-in-Functions-Available-on-all-Configurations.html#Basic-PowerPC-Built-in-Functions-Available-on-all-Configurations
> .
> Similarly for x86.
>
> This is the mechanism that the dynamic linker / runtime use to select
> the best implementation (of for example memcpy or cosf128) based on
> the platform (POWER8 vs POWER9).
>
> This CAN be used by any project/package. I am working on documenting
> this (implementing dynamic IFUNC selection for platform specific
> optimizations) as part of the PVECLIB project. If interested send me a
> note.
>

This explanation helps quite a bit. Thanks. However, I'm not really a
programmer. My "developer" duties for the various freetel projects are
largely limited to creating/managing the CMake build system.

As far as LPCNet itself I've communicated with the primary developer quite
a bit over the last week. LPCNet *will not work* without optimizations (at
least not in real time which is the point).

I think we're going to work towards a static build option, we just have to
figure out the mechanics. The main application does check for AVX/AVX2 and
disables the 2020 mode if it's not available.

Thanks,
Richard


Re: What CPU extensions can we assume are available by arch?

2020-04-24 Thread Steven Munroe
There are really two questions here: will/can Fedora automatically take
advantage of a new instruction set architecture (ISA), and can my
project take advantage of a new ISA?

The short answer is yes/maybe.

The longer answer is that the mechanisms exist (built into the compilers
and runtime), but not all projects and packages take advantage of
them (for example, dynamic libraries can use platform-based
library search or the STT_GNU_IFUNC symbol type extension). The kernel
and GLIBC usually do. This requires the platform maintainers to do
extra work. I don't think either is commonly used for general library
packages.

The distro/kernel/runtime/community will decide on a "base platform
ISA level". For PowerPC64le that is POWER8 (PowerISA-2.07B). This
includes VMX/VSX but not the POWER9 extensions like Float128 hardware
FP.
See: https://fedoraproject.org/wiki/Architectures/PowerPC

Of course the distro (will/should) install on newer processors
(like POWER9) which have a larger ISA (more instructions).

The Linux kernel notifies each process of the Platform and Hardware
Capabilities in the AUX Vector (defined in the Application Binary
Interface document for each platform). The compilers provide an easy to
use interface to interrogate this:
https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Basic-PowerPC-Built-in-Functions-Available-on-all-Configurations.html#Basic-PowerPC-Built-in-Functions-Available-on-all-Configurations.
Similarly for x86.

This is the mechanism that the dynamic linker / runtime use to select
the best implementation (of for example memcpy or cosf128) based on
the platform (POWER8 vs POWER9).
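
For illustration, the AUX vector entries can also be read directly with
glibc's getauxval(). This is only a sketch: the wrapper names are made up,
and the meaning of the individual AT_HWCAP bits is architecture-specific
(see the ABI document for the platform).

```c
#if defined(__linux__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP, AT_PLATFORM */
#endif

/* Raw hardware-capability bit mask the kernel passed to this process;
 * 0 if the entry is missing or this is not Linux. */
unsigned long hwcap_bits(void)
{
#if defined(__linux__)
    return getauxval(AT_HWCAP);
#else
    return 0;
#endif
}

/* Platform string from the AUX vector, or "unknown" if it is absent. */
const char *platform_name(void)
{
#if defined(__linux__)
    const char *p = (const char *)getauxval(AT_PLATFORM);
    if (p != 0)
        return p;
#endif
    return "unknown";
}
```

In most code the compiler built-ins mentioned above are the more convenient
interface; reading the vector directly mainly helps when debugging what the
kernel actually reported.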

This CAN be used by any project/package. I am working on documenting
this (implementing dynamic IFUNC selection for platform specific
optimizations) as part of the PVECLIB project. If interested send me a
note.
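
Until that documentation exists, here is a minimal sketch of manual IFUNC
selection, assuming GCC on x86_64 GNU/Linux (the thread's ppc64le case
would test POWER features instead). All names (`sum`, `sum_generic`,
`sum_avx2`, `resolve_sum`, `HAVE_IFUNC_SKETCH`) are invented for
illustration and are not PVECLIB's API.

```c
#include <assert.h>

#if defined(__GNUC__) && !defined(__clang__) && \
    defined(__x86_64__) && defined(__gnu_linux__)
#define HAVE_IFUNC_SKETCH 1
#endif

/* Baseline implementation, safe on every CPU of the architecture. */
static int sum_generic(const int *v, int n)
{
    assert(n >= 0);
    int s = 0;
    for (int i = 0; i < n; i++)
        s += v[i];
    return s;
}

#ifdef HAVE_IFUNC_SKETCH
/* Same source, but the compiler may use AVX2 inside this function only. */
static __attribute__((target("avx2"))) int sum_avx2(const int *v, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Resolver: run by the dynamic linker to bind 'sum'.  It executes very
 * early, so the CPU model must be initialised explicitly first. */
static int (*resolve_sum(void))(const int *, int)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? sum_avx2 : sum_generic;
}

/* Public symbol of type STT_GNU_IFUNC; callers see an ordinary function. */
int sum(const int *v, int n) __attribute__((ifunc("resolve_sum")));
#else
/* Fallback when IFUNC is unavailable: always use the baseline. */
int sum(const int *v, int n) { return sum_generic(v, n); }
#endif
```

The selection cost is paid once when the symbol is bound; after that a call
to sum() goes straight to the chosen implementation.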


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Sam Varshavchik

Richard Shaw writes:

> I'm having trouble determining what the base CPU targets for Fedora can
> accommodate.
>
> For example, is it safe to assume AVX2 on x86_64?


Nope.

Linux thinkpad 5.5.16-200.fc31.x86_64 #1 SMP Wed Apr 8 16:43:33 UTC 2020  
x86_64 x86_64 x86_64 GNU/Linux

[
/proc/cpuinfo:

flags   : ... avx ... 

Last year a proposal was floated to require avx2 for Fedora 3-something. It  
didn't go well.






Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Richard Shaw
On Wed, Apr 22, 2020 at 10:42 AM Richard W.M. Jones 
wrote:

> On Wed, Apr 22, 2020 at 10:19:47AM -0500, Richard Shaw wrote:
> > I'm not sure all upstream have the manpower or knowledge to do it
> > dynamically, plus in the case of LPCNet, it's required as it's the whole
> > point of the project.
>
> Surely there must be something which decides whether or not to use
> LPCNet?  It sounds like this pushes the decision / fallback path to
> some other component.
>

The fork of LPCNet specifically being developed for ham radio digital
communications will build without optimizations but warns that the code
will be very slow. FreeDV (already in Fedora) is getting a new "mode"
which requires LPCNet to work.

So without the glibc implementation I'm still "stuck".

Thanks,
Richard


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Richard W.M. Jones
On Wed, Apr 22, 2020 at 10:19:47AM -0500, Richard Shaw wrote:
> On Wed, Apr 22, 2020 at 10:13 AM Michael Cronenworth 
> wrote:
> 
> > On 4/22/20 9:51 AM, Richard Shaw wrote:
> > > So it seems what's really needed is a method to support software with
> > > optimizations above the baseline without leaving other people behind.
> >
> > There are varying software that do this. Glibc is one. FFmpeg is another.
> > There's
> > also a library that could return supported features, but I'm not sure if
> > this will
> > work for your use-case.
> >
> 
> I'm not sure all upstream have the manpower or knowledge to do it
> dynamically, plus in the case of LPCNet, it's required as it's the whole
> point of the project.

Surely there must be something which decides whether or not to use
LPCNet?  It sounds like this pushes the decision / fallback path to
some other component.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Florian Weimer
* Dominik Mierzejewski:

> If upstream doesn't support runtime selection of SIMD-optimized code
> paths based on CPU capabilities, you can build several versions and
> have glibc handle that by putting each version in a special subdirectory
> of /usr/lib64, e.g.:
> /usr/lib64/haswell
> /usr/lib64/haswell/avx512_1/
>
> See
> https://clearlinux.org/news-blogs/transparent-use-library-packages-optimized-intel-architecture
> for some details.

These are ill-defined, unfortunately, and as such not very useful.
For example, -march=haswell does not result in code that can be
placed in a haswell subdirectory.

I'm working with AMD and Intel to come up with a better subdirectory
definition that also matches between GCC and glibc.

Thanks,
Florian


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Jerry James
On Wed, Apr 22, 2020 at 9:25 AM Dominik 'Rathann' Mierzejewski
 wrote:
> If upstream doesn't support runtime selection of SIMD-optimized code
> paths based on CPU capabilities, you can build several versions and
> have glibc handle that by putting each version in a special subdirectory
> of /usr/lib64, e.g.:
> /usr/lib64/haswell
> /usr/lib64/haswell/avx512_1/

See https://pagure.io/packaging-committee/issue/926.
-- 
Jerry James
http://www.jamezone.org/


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Dominik 'Rathann' Mierzejewski
On Wednesday, 22 April 2020 at 16:51, Richard Shaw wrote:
> On Wed, Apr 22, 2020 at 9:28 AM Artur Iwicki  wrote:
> 
> > Regarding x86_64 and AVX2, last year we had a very heated discussion about
> > this on the mailing list.
> >
> >
> > https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/MPFXH4Y5LDC5L2VXWKUHAX3WAKBQXR4U/#MPFXH4Y5LDC5L2VXWKUHAX3WAKBQXR4U
> >
> > tl;dr: there was a proposal to make "x86_64" in Fedora mean "must support
> > at least AVX2" and it met with a lot of backlash.
> >
> 
> Now that you mention it that does tickle some brain cells...
> 
> So it seems what's really needed is a method to support software with
> optimizations above the baseline without leaving other people behind.
> 
> The only way I can think of to do that would be to have optional
> "enhanced" repos available that people can enable if their system supports
> it. The hard part would be keeping it in sync with the main repo. It would
> have to be a parallel build process and similar to the current process if
> one fails the build is a NOGO across the board.
> 
> Could we treat them like arches?
>  Something like:
> X86_64 & X86_64-AVX2

If upstream doesn't support runtime selection of SIMD-optimized code
paths based on CPU capabilities, you can build several versions and
have glibc handle that by putting each version in a special subdirectory
of /usr/lib64, e.g.:
/usr/lib64/haswell
/usr/lib64/haswell/avx512_1/

See
https://clearlinux.org/news-blogs/transparent-use-library-packages-optimized-intel-architecture
for some details.

> armv7hf & armv7fh-NEON

There already is armv7hnl (=ARMv7 with NEON) and RPM Fusion makes use of
it, for example. Fedora builders don't support this as far as I know.

Hope that helps.

Regards,
Dominik
-- 
Fedora   https://getfedora.org  |  RPM Fusion  http://rpmfusion.org
There should be a science of discontent. People need hard times and
oppression to develop psychic muscles.
-- from "Collected Sayings of Muad'Dib" by the Princess Irulan


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Richard Shaw
On Wed, Apr 22, 2020 at 10:13 AM Michael Cronenworth 
wrote:

> On 4/22/20 9:51 AM, Richard Shaw wrote:
> > So it seems what's really needed is a method to support software with
> > optimizations above the baseline without leaving other people behind.
>
> There are varying software that do this. Glibc is one. FFmpeg is another.
> There's
> also a library that could return supported features, but I'm not sure if
> this will
> work for your use-case.
>

I'm not sure all upstream have the manpower or knowledge to do it
dynamically, plus in the case of LPCNet, it's required as it's the whole
point of the project.

Thanks,
Richard


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Michael Cronenworth

On 4/22/20 9:51 AM, Richard Shaw wrote:
> So it seems what's really needed is a method to support software with
> optimizations above the baseline without leaving other people behind.

There are various pieces of software that do this. Glibc is one. FFmpeg is
another. There's also a library that could return supported features, but
I'm not sure if this will work for your use-case.


https://github.com/google/cpu_features


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Richard Shaw
On Wed, Apr 22, 2020 at 9:28 AM Artur Iwicki  wrote:

> Regarding x86_64 and AVX2, last year we had a very heated discussion about
> this on the mailing list.
>
>
> https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/MPFXH4Y5LDC5L2VXWKUHAX3WAKBQXR4U/#MPFXH4Y5LDC5L2VXWKUHAX3WAKBQXR4U
>
> tl;dr: there was a proposal to make "x86_64" in Fedora mean "must support
> at least AVX2" and it met with a lot of backlash.
>

Now that you mention it that does tickle some brain cells...

So it seems what's really needed is a method to support software with
optimizations above the baseline without leaving other people behind.

The only way I can think of to do that would be to have optional
"enhanced" repos available that people can enable if their system supports
them. The hard part would be keeping them in sync with the main repo. It
would have to be a parallel build process and, as in the current process,
if one build fails it is a NOGO across the board.

Could we treat them like arches?
Something like:
X86_64 & X86_64-AVX2
armv7hf & armv7hf-NEON
etc...

It would absolutely have to be OPT IN because once you enable it your
system is no longer necessarily portable to other hardware.

Thanks,
Richard


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Artur Iwicki
Regarding x86_64 and AVX2, last year we had a very heated discussion about this 
on the mailing list.

https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/MPFXH4Y5LDC5L2VXWKUHAX3WAKBQXR4U/#MPFXH4Y5LDC5L2VXWKUHAX3WAKBQXR4U

tl;dr: there was a proposal to make "x86_64" in Fedora mean "must support at 
least AVX2" and it met with a lot of backlash.


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Richard Shaw
Ok, so to sum up...

I can assume NEON on aarch64 and that's about it.

So how do I handle this from a packaging perspective? Do I build multiple
times and create optimized and non-optimized packages?

 and  -AVX/AVX2/NEON?

I don't want to ExcludeArch as long as it "builds", but without the
extensions I'm not sure real-time processing can happen.

Thanks,
Richard


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Florian Weimer
* Peter Robinson:

>> > What about VSX on ppc64le?
>>
>> Yes, but only the parts that are in POWER8.
>
> We only support POWER8 and later in Fedora for ppc64le.

Yes, that's what I meant: you cannot assume POWER9 or -mfuture.

(There is nothing earlier than POWER8 for ppc64le anyway, some earlier
machine types were only used for the architecture bringup and do not
implement the ppc64le ISA.)

Thanks,
Florian


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Marcin Juszkiewicz
On 22.04.2020 at 14:52, Florian Weimer wrote:
>> At this time upstream only supports AVX/AVX2/NEON, but if they did
>> add support for SVE on aarch64, can I use it?

> No.  I don't think the builders support it yet.

You can assume NEON on aarch64, as it is a mandatory part of the
architecture. SVE is in only one or two CPUs so far, so you cannot assume it.

On armhfp you have nothing. Builders have NEON but there were some
armv7 cpus without NEON...
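
[Editor's sketch] One way to see what a given build may assume is to check the
compiler's predefined macros, which reflect the target flags in effect. The
function name baseline_simd below is hypothetical. The macros are the usual
GCC/Clang ones, and the result depends entirely on the flags used, so treat
this as an illustration, not a statement of Fedora policy.

```c
/* Report the strongest SIMD extension the compiler was told it may
 * assume for this translation unit. With default distribution flags
 * one would expect "SSE2" on x86_64 and "NEON" on aarch64; on a
 * 32-bit ARM build without an explicit NEON FPU option the NEON
 * macros are not defined. */
const char *baseline_simd(void)
{
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
    return "NEON";
#elif defined(__VSX__)
    return "VSX";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none";
#endif
}
```

Code guarded by the same macros compiles in the optimized path only when the
baseline already guarantees the extension; anything beyond that needs a
runtime check.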


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Peter Robinson
On Wed, Apr 22, 2020 at 1:54 PM Florian Weimer wrote:
>
> * Richard Shaw:
>
> > I'm working on packaging a project that makes use of AVX/AVX2/NEON to
> > process and improve the quality of low bitrate human speech:
> >
> > https://github.com/mozilla/LPCNet
> >
> > I'm having trouble determining what the base CPU targets for Fedora can
> > accommodate.
> >
> > For example, is it safe to assume AVX2 on x86_64?
>
> No (although the builders support it).
>
> You can assume SSE2, but nothing more recent.
>
> The glibc dynamic loader can select different library versions based on
> the ISA support level, but this is more of a theory at this point
> because there are no useful, well-documented selection criteria.
>
> > At this time upstream only supports AVX/AVX2/NEON, but if they did add
> > support for SVE on aarch64, can I use it?
>
> No.  I don't think the builders support it yet.
>
> > What about VSX on ppc64le?
>
> Yes, but only the parts that are in POWER8.

We only support POWER8 and later in Fedora for ppc64le.

> > It doesn't look like I have any options on s390x since the base is
> > "-march=zEC12+" which from what I can tell doesn't support the
> > extended intrinsics required.
>
> Yes, vectorization is added in z13 (with more extensions in later ISAs).
> Red Hat Enterprise Linux 8 is built for z13, so it may make sense to
> have conditionals that automatically detect this.
>
> Thanks,
> Florian


Re: What CPU extensions can we assume are available by arch?

2020-04-22 Thread Florian Weimer
* Richard Shaw:

> I'm working on packaging a project that makes use of AVX/AVX2/NEON to
> process and improve the quality of low bitrate human speech:
>
> https://github.com/mozilla/LPCNet
>
> I'm having trouble determining what the base CPU targets for Fedora can
> accommodate.
>
> For example, is it safe to assume AVX2 on x86_64?

No (although the builders support it).

You can assume SSE2, but nothing more recent.

The glibc dynamic loader can select different library versions based on
the ISA support level, but this is more of a theory at this point
because there are no useful, well-documented selection criteria.

> At this time upstream only supports AVX/AVX2/NEON, but if they did add
> support for SVE on aarch64, can I use it?

No.  I don't think the builders support it yet.

> What about VSX on ppc64le?

Yes, but only the parts that are in POWER8.

> It doesn't look like I have any options on s390x since the base is
> "-march=zEC12+" which from what I can tell doesn't support the
> extended intrinsics required.

Yes, vectorization is added in z13 (with more extensions in later ISAs).
Red Hat Enterprise Linux 8 is built for z13, so it may make sense to
have conditionals that automatically detect this.

Thanks,
Florian