Re: Altivec in baseline for ppc64?

2021-07-18 Thread Riccardo Mottola

Hi,

John Paul Adrian Glaubitz wrote:

> It depends on the workload, of course. Anything that does SIMD like matrix
> multiplications in multimedia or scientific computing will, of course,
> profit from enabling AltiVec.
>
> No one claimed that AltiVec, MMX or SSE will just improve everything.


Exactly, let me clarify something from direct experience. I don't know if
things have improved recently, but some years ago GCC essentially did not
do real auto-vectorization. Enabling AltiVec by itself essentially does
nothing: it allows you to write code that uses it, but it will not
magically make existing code faster.
For specific libraries and applications which do support AltiVec (e.g.
many video-decoding or image-processing ones) the improvements may be
great, especially on "older" processors like the G4 or G5.
On Intel, by contrast, code may get faster with MMX or SSEx because
additional resources and registers are opened up (on 32-bit).


If AltiVec can now benefit from auto-vectorization too, I'd be interested
to know. I have written imaging code myself and never noticed an
improvement from AltiVec just by turning it on, but I have not retested
with a recent GCC.
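
To make the distinction concrete, here is a minimal sketch of what "writing code to use it" looks like next to a plain loop. The function name add_f32 is made up for illustration; the AltiVec branch uses the standard <altivec.h> intrinsics vec_ld, vec_add and vec_st and is only compiled when the compiler defines __ALTIVEC__ (e.g. with -maltivec). Whether the scalar branch gets auto-vectorized depends on the compiler version and flags, which is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

/* Adds b[] into a[] element-wise (hypothetical helper for this sketch).
 * n must be a multiple of 4. On the AltiVec path both pointers must be
 * 16-byte aligned, because vec_ld/vec_st ignore the low address bits. */
void add_f32(float *a, const float *b, size_t n)
{
    assert(n % 4 == 0);
#if defined(__ALTIVEC__)
    /* Explicit SIMD: four floats per iteration. */
    for (size_t i = 0; i < n; i += 4) {
        vector float va = vec_ld(0, &a[i]);
        vector float vb = vec_ld(0, &b[i]);
        vec_st(vec_add(va, vb), 0, &a[i]);
    }
#else
    /* Plain scalar loop: any speed-up here is up to the optimizer. */
    for (size_t i = 0; i < n; i++)
        a[i] += b[i];
#endif
}
```

Built without -maltivec, only the scalar branch exists, so the same source still runs on an e5500.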



> I would argue that there are far more users with PowerMacs, which all support
> AltiVec, than with obscure embedded machines. As I said, some packages like
> nss2 already enabled AltiVec by default.


As far as PowerMacs go you are completely right, but PPC is used more and
more on "other" systems too: boards built around the available NXP
processors, several of which are based on embedded CPUs, e.g. e5500 cores
and variations.

Our upcoming PPC laptop will have AltiVec, but others will not.

Riccardo



Re: Altivec in baseline for ppc64?

2021-07-18 Thread Luke Kenneth Casson Leighton
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Sun, Jul 18, 2021 at 4:17 AM Jeffrey Walton wrote:

> I never really took notice that IBM captured the projects. But with
> your lens it sounds about right.
>
> ("Captured", as in regulatory capture as seen in the US. Regulatory
> capture is where private industry gets so cozy with government and
> regulators that industry writes their own rules and gov is just an
> extension of a few dominant players).

please do forgive my frustration at this particular strategic mistake:
i don't believe IBM intended for this to happen, it's more that there
simply wasn't anyone else around at the time to say "if you do
make this innocent-looking decision because it improves performance
for the *currently* only-existing hardware for the past 10 years - yours -
it's going to have consequences".


> Forgive my ignorance... Once traces of companies like NXP and IBM are
> removed, and Altivec is removed, what is left?

what has now been termed the "Scalar Fixed Point Compliancy Subset"
and the "Scalar Floating Point Compliancy Subset".  120 and 200
instructions or so, respectively.  they're in the v3.0C spec (1st few
pages) https://ftp.libre-soc.org/PowerISA_public.v3.0C.pdf and Henriok
and i updated the wikipedia page on Power ISA recently.
https://en.wikipedia.org/wiki/Power_ISA#Compliancy

> Are some pieces of the ISA going to be (re)used?

yes, the entire scalar floating-point subset.  HOWEVER...
unnnforrrtunately, again: the assumption was "you'll always
have VSX", and as you're no doubt aware, FP <-> INT
conversion in Power ISA *used* to have move instructions
between FP and INT... but with VSX *assuming even integers
can be done in SIMD registers* those scalar FP<->INT MV
instructions were REMOVED somewhere around v2.06

so... we have to put in an RFP to put them back in, and at
the same time we intend to take the opportunity to propose
adding different *types* of conversion (java, javascript),
as well as "FP Load Immediate" using BF16 as the constant

https://libre-soc.org/openpower/sv/int_fp_mv/

> Or is the ISA being built from scratch?

hell no, that would be insane, you're no doubt aware how long
it took OpenRISC 1200 to fully mature.

in a nutshell, we're taking the scalar parts of the Power ISA, and
creating a "REP-like" repeater (with Vector context).  this multiplies
the base 200-ish instructions by between 1,000 and 8,000 to give
a total number of intrinsics somewhere well north of a quarter of
a million "options".  if you then bring in 3D Swizzle and other
advanced features that becomes several million.

all *WITHOUT* actually adding one single explicit Vector instruction
(as is normally done for Vector ISAs).
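
As a rough mental model only (this is not taken from the SVP64 specification), the "repeater" idea can be pictured as an ordinary scalar operation applied a set number of times to consecutive registers. Everything named below (cpu_model, vl, NUM_REGS, repeated_add) is hypothetical and chosen just for this sketch; the real encoding, register file size and element handling are defined at the libre-soc.org links in this message.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 128  /* register count picked for the sketch only */

/* Toy machine state: a register file plus a "vector length" context. */
typedef struct {
    uint64_t gpr[NUM_REGS];
    unsigned vl;          /* how many times a prefixed scalar op repeats */
} cpu_model;

/* The ordinary scalar instruction: rt = ra + rb. */
static void scalar_add(cpu_model *c, unsigned rt, unsigned ra, unsigned rb)
{
    c->gpr[rt] = c->gpr[ra] + c->gpr[rb];
}

/* The "repeater": no new vector add is defined, the same scalar add is
 * simply issued vl times on consecutive register numbers. */
void repeated_add(cpu_model *c, unsigned rt, unsigned ra, unsigned rb)
{
    assert(c != 0);
    assert(rt + c->vl <= NUM_REGS);
    assert(ra + c->vl <= NUM_REGS);
    assert(rb + c->vl <= NUM_REGS);
    for (unsigned i = 0; i < c->vl; i++)
        scalar_add(c, rt + i, ra + i, rb + i);
}
```

The point of the model is that the loop wraps an existing scalar operation, which is why the count of usable combinations grows multiplicatively without any explicit Vector instruction being added.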

> Is it even PowerPC anymore?

yes - just very advanced.  or... more like: revival of 30+ year old
concepts and strapping JATO rockets to them.

> Is it just a RISC machine?

that's up to hardware micro-architects.  i'm designing Power ISA extensions

> (PowerPC scatter/gather is quite lame, even in the latest ISA 3.0B, so
> I'm not sure there's much good stuff to reuse).

SVP64 is... comprehensive. https://libre-soc.org/openpower/sv/

but, bear in mind: SVP64 is micro-architecturally agnostic.  even an
Embedded system could implement some levels of it and get reduced
program size and reduced power consumption, rather than leverage
it for high performance.

> Anyway, I'm excited to see a SIMD alternative that could become
> mainstream.

Cray-style Vectors are seriously misunderstood. i cannot begin to
express how badly the entire computing industry has f* up by
ignoring them. AVX-512 is a wake-up call: ARM SVE is a half-way
step in the right direction.

even the wikipedia page explicitly stated (until last month),
"Cray vectors are only good for numbercrunching and SIMD by
comparison is superior in every way" which is just... blatantly
and demonstrably false even with a basic hand-drawn timing diagram
https://en.wikipedia.org/wiki/Vector_processor#/media/File:Simd_vs_vector.png
explained here:
https://youtu.be/gexy0z1YqFY?t=676

> I really look forward to the new hardware making it to
> market.

appreciated.

l.



Re: Altivec in baseline for ppc64?

2021-07-17 Thread Jeffrey Walton
> On Fri, Jul 16, 2021 at 12:09 PM John Paul Adrian Glaubitz wrote:
> >
> > On 7/16/21 12:59 PM, Jeffrey Walton wrote:
> > > Does it have to be one or the other? Can't you have both?
> >
> > Well, you could have runtime detection like certain multimedia codes and
> > OpenSSL use but most packages don't do that.
>
> this is the "sane" way to do it.  unfortunately, the EABIv2, which
> *explicitly* states, "SIMD is mandatory" is resulting in an inexorable
> creep of submissions from IBM developers (to libc6 and other
> libraries)

I never really took notice that IBM captured the projects. But with
your lens it sounds about right.

("Captured", as in regulatory capture as seen in the US. Regulatory
capture is where private industry gets so cozy with government and
regulators that industry writes their own rules and gov is just an
extension of a few dominant players).

> with quad-core and 8-core 1.5+ GHz parts on the roadmap over the next 3 years,
> Libre-SOC's processor is *not* intended for just "embedded" uses.
> we've simply made it abundantly clear that Hell Will Freeze Over
> before we add a suicidal *700* SIMD instructions to what is supposed
> to be a RISC design.

Forgive my ignorance... Once traces of companies like NXP and IBM are
removed, and Altivec is removed, what is left? Are some pieces of the
ISA going to be (re)used? Or is the ISA being built from scratch? Is
it even PowerPC anymore? Is it just a RISC machine?

(PowerPC scatter/gather is quite lame, even in the latest ISA 3.0B, so
I'm not sure there's much good stuff to reuse).

Anyway, I'm excited to see a SIMD alternative that could become
mainstream. I really look forward to the new hardware making it to
market.

Jeff



Re: Altivec in baseline for ppc64?

2021-07-17 Thread Luke Kenneth Casson Leighton

On Fri, Jul 16, 2021 at 12:09 PM John Paul Adrian Glaubitz wrote:
>
> On 7/16/21 12:59 PM, Jeffrey Walton wrote:
> > Does it have to be one or the other? Can't you have both?
>
> Well, you could have runtime detection like certain multimedia codes and
> OpenSSL use but most packages don't do that.

this is the "sane" way to do it.  unfortunately, the EABIv2, which
*explicitly* states, "SIMD is mandatory" is resulting in an inexorable
creep of submissions from IBM developers (to libc6 and other
libraries)

with quad-core and 8-core 1.5+ GHz parts on the roadmap over the next 3 years,
Libre-SOC's processor is *not* intended for just "embedded" uses.
we've simply made it abundantly clear that Hell Will Freeze Over
before we add a suicidal *700* SIMD instructions to what is supposed
to be a RISC design.

l.



Re: Altivec in baseline for ppc64?

2021-07-16 Thread Luke Kenneth Casson Leighton

On Fri, Jul 16, 2021 at 2:55 PM Lennart Sorensen wrote:

> Some years ago I tried to get such runtime detection added for vlc.
> Upstream was certainly far from helpful and just about hostile to the
> idea of adding runtime detection code on powerpc.

because "why would you support 10+ year old processors", i do get it,
but it is very frustrating.

>  Given I was just
> trying to help out the people hitting a problem with it (I don't have
> any desktop powerpc, I only deal with embedded), I decided it wasn't
> worth my effort to argue with them.

Libre-SOC's SVP64 ISA needs a stunning 4 to 5 times fewer instructions for
the MP3 CODEC inner loop.  i'm also adding hardware-level support for in-place
FFT butterfly, such that *any* sized FFT should be possible to do in around
45 instructions.

when hardware's available this _should_ shock ffmpeg and other
developers out of rejecting upstream patches for non-IBM non-SIMD
contributions.

... but the more mandatory patches "#ifdef POWER9" include VSX,
the harder that becomes to do, because our processor will be POWER9
compliant... just without SIMD.
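
A minimal sketch of why this matters for patches: GCC predefines separate macros for the ISA level (_ARCH_PWR9, set by -mcpu=power9) and for the SIMD facilities (__ALTIVEC__, __VSX__). Code that keys VSX usage on the ISA-level macro alone assumes the two always travel together. The function names below are made up for illustration; the macros are the compiler's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* ISA level the compiler was asked to target (-mcpu=...). */
static const char *isa_level(void)
{
#if defined(_ARCH_PWR9)
    return "power9";
#else
    return "baseline";
#endif
}

/* SIMD facility the compiler was allowed to use (-maltivec, -mvsx).
 * This is the macro a VSX code path should be guarded by. */
static const char *simd_path(void)
{
#if defined(__VSX__)
    return "vsx";
#elif defined(__ALTIVEC__)
    return "altivec";
#else
    return "scalar";
#endif
}

/* Writes "<isa>/<simd>" into buf, e.g. "power9/scalar". */
void describe_target(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%s/%s", isa_level(), simd_path());
}
```

With GCC, building with -mcpu=power9 -mno-vsx -mno-altivec should be one way to get the "power9/scalar" combination described above, assuming the toolchain accepts that mix.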

l.



Re: Altivec in baseline for ppc64?

2021-07-16 Thread Lennart Sorensen
On Fri, Jul 16, 2021 at 01:08:47PM +0200, John Paul Adrian Glaubitz wrote:
> Well, you could have runtime detection like certain multimedia codes and
> OpenSSL use but most packages don't do that.
>
> Either way, if certain downstreams want better support for certain targets,
> they are always welcome to jump in and send patches to me or upstream. This
> port is a community effort, after all.

Some years ago I tried to get such runtime detection added for vlc.
Upstream was certainly far from helpful and just about hostile to the
idea of adding runtime detection code on powerpc.  Given I was just
trying to help out the people hitting a problem with it (I don't have
any desktop powerpc, I only deal with embedded), I decided it wasn't
worth my effort to argue with them.

-- 
Len Sorensen



Re: Altivec in baseline for ppc64?

2021-07-16 Thread John Paul Adrian Glaubitz
On 7/16/21 12:59 PM, Jeffrey Walton wrote:
>> I would argue that there are far more users with PowerMacs, which all support
>> AltiVec, than with obscure embedded machines. As I said, some packages like
>> nss2 already enabled AltiVec by default.
> 
> Does it have to be one or the other? Can't you have both?

Well, you could have runtime detection like certain multimedia codes and
OpenSSL use but most packages don't do that.
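
For reference, a minimal sketch of what such runtime detection can look like on Linux, assuming the C library's <sys/auxv.h> provides getauxval() and the PPC_FEATURE_HAS_ALTIVEC bit (glibc does on powerpc). The names cpu_has_altivec, add_fn, add_generic and select_add are hypothetical; a real package would put its AltiVec routine in a separate file built with -maltivec and pass it in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#if defined(__linux__) && defined(__powerpc__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP, PPC_FEATURE_* */
#endif

/* True only when the kernel reports AltiVec in the auxiliary vector. */
bool cpu_has_altivec(void)
{
#if defined(AT_HWCAP) && defined(PPC_FEATURE_HAS_ALTIVEC)
    return (getauxval(AT_HWCAP) & PPC_FEATURE_HAS_ALTIVEC) != 0;
#else
    return false;   /* not Linux/PowerPC: always take the generic path */
#endif
}

typedef void (*add_fn)(float *a, const float *b, size_t n);

/* Baseline-safe implementation that every CPU can run. */
static void add_generic(float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] += b[i];
}

/* Picks the AltiVec routine only if one was supplied AND the CPU has it. */
add_fn select_add(add_fn generic, add_fn altivec)
{
    assert(generic != NULL);
    return (altivec != NULL && cpu_has_altivec()) ? altivec : generic;
}
```

The selection would normally be done once at start-up, so the binary as a whole stays within the baseline and only the optional routine uses AltiVec instructions.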

Either way, if certain downstreams want better support for certain targets,
they are always welcome to jump in and send patches to me or upstream. This
port is a community effort, after all.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: Altivec in baseline for ppc64?

2021-07-16 Thread Jeffrey Walton
On Fri, Jul 16, 2021 at 2:59 AM John Paul Adrian Glaubitz wrote:
>
> On 7/15/21 5:49 PM, Christian Zigotzky wrote:
> > I disagree too because the performance of software with AltiVec support
> > isn't as high as expected. I tested it a lot because we have AltiVec and
> > Non-Altivec machines here. We changed to Non-AltiVec compiled software a
> > while ago.
>
> It depends on the workload, of course. Anything that does SIMD like matrix
> multiplications in multimedia or scientific computing will, of course,
> profit from enabling AltiVec.
>
> No one claimed that AltiVec, MMX or SSE will just improve everything.

+1

I've found a few algorithms that were profitable on ARM and x86, but
not profitable with Altivec. The LEA algorithm comes to mind:
https://github.com/weidai11/cryptopp/blob/master/lea_simd.cpp#L46.
Altivec runs about 5x slower than C++.

> > We and the MintPPC team had some problems with the VLC media player on our
> > Non-AltiVec machines a year ago because it is only available in an AltiVec
> > version.
> > The MintPPC team had to recompile it. [1]
> > We also had to recompile the version 3.0.12 of VLC because of the AltiVec
> > dependency again. [2]
>
> I would argue that there are far more users with PowerMacs, which all support
> AltiVec, than with obscure embedded machines. As I said, some packages like
> nss2 already enabled AltiVec by default.

Does it have to be one or the other? Can't you have both?

Jeff



Re: Altivec in baseline for ppc64?

2021-07-16 Thread John Paul Adrian Glaubitz
On 7/15/21 5:49 PM, Christian Zigotzky wrote:
> I disagree too because the performance of software with AltiVec support isn't
> as high as expected. I tested it a lot because we have AltiVec and Non-Altivec
> machines here. We changed to Non-AltiVec compiled software a while ago.

It depends on the workload, of course. Anything that does SIMD like matrix
multiplications in multimedia or scientific computing will, of course, profit
from enabling AltiVec.

No one claimed that AltiVec, MMX or SSE will just improve everything.

> We and the MintPPC team had some problems with the VLC media player on our
> Non-AltiVec machines a year ago because it is only available in an AltiVec
> version.
> The MintPPC team had to recompile it. [1]
> We also had to recompile the version 3.0.12 of VLC because of the AltiVec
> dependency again. [2]

I would argue that there are far more users with PowerMacs, which all support
AltiVec, than with obscure embedded machines. As I said, some packages like
nss2 already enabled AltiVec by default.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: Altivec in baseline for ppc64?

2021-07-15 Thread Luke Kenneth Casson Leighton
On 7/13/21, John Paul Adrian Glaubitz wrote:

> I wasn't really a fan of that change but my stance is that we should use
> AltiVec in packages where it makes sense as the majority of the ppc64 port
> users will have a machine that supports AltiVec.


please *do not* do this.

we are designing a modern Libre/Open CPU which will not punish
developers or ourselves, stabbing ourselves in the head with 950
instructions.

Power ISA is supposed to be RISC.

the only reason SIMD was made mandatory in EABI v2 was because there
was nobody making hardware other than IBM to object to what is
becoming recognised as an extremely serious and costly mistake for the
future of the OpenPOWER ecosystem.

if people continue to assume that SIMD is acceptable just because the
only current hardware is from IBM it only makes it harder and more and
more costly to unravel the f***up and widens an already yawning
barrier to entry for new implementations of OpenPOWER.

PLEASE, really, for god's sake, please do NOT recommend people
propagate the SIMD paradigm.

https://www.sigarch.org/simd-instructions-considered-harmful/

l.



Re: Altivec in baseline for ppc64?

2021-07-15 Thread Christian Zigotzky

On 15 July 2021 at 2:04 pm, Sébastien Villemot wrote:
> On Tuesday 13 July 2021 at 21:04 +0200, John Paul Adrian Glaubitz wrote:
>> Please go ahead and enable AltiVec as I don't think it makes much sense to
>> use BLAS on machines without any SIMD support. If any user complains about
>> compatibility issues, please feel free to bring up the issue here again.
>
> I think I disagree with this idea. OpenBLAS can be pulled in by chains
> of dependencies, even for users who do not even know what BLAS is.
> Violating the baseline can lead to hard-to-understand crashes.
> Since I think that reliability is more important than performance, I
> prefer to strictly respect the baseline in the binary package.


Hi Sébastien,

I disagree too because the performance of software with AltiVec support 
isn't as high as expected. I tested it a lot because we have AltiVec and 
Non-Altivec machines here. We changed to Non-AltiVec compiled software a 
while ago.


We and the MintPPC team had some problems with the VLC media player on 
our Non-AltiVec machines a year ago because it is only available in an 
AltiVec version.

The MintPPC team had to recompile it. [1]
We also had to recompile the version 3.0.12 of VLC because of the 
AltiVec dependency again. [2]


Cheers,
Christian

A-EON Technology Core Linux support team
http://www.a-eon.com
A-EON Technology Ltd
Asquith House
Unit 1 Dyfrig Rd
Cardiff CF5 5AD
United Kingdom

[1] https://forum.hyperion-entertainment.com/viewtopic.php?p=51952#p51952
[2] https://forum.hyperion-entertainment.com/viewtopic.php?p=52326#p52326



Re: Altivec in baseline for ppc64?

2021-07-15 Thread John Paul Adrian Glaubitz
On 7/15/21 2:04 PM, Sébastien Villemot wrote:
>> Please go ahead and enable AltiVec as I don't think it makes much sense to
>> use BLAS on machines without any SIMD support. If any user complains about
>> compatibility issues, please feel free to bring up the issue here again.
> 
> I think I disagree with this idea. OpenBLAS can be pulled in by chains
> of dependencies, even for users who do not even know what BLAS is.
> Violating the baseline can lead to hard-to-understand crashes.

True, but all build servers we have support Altivec.

> Since I think that reliability is more important than performance, I 
> prefer to strictly respect the baseline in the binary package.

Sure. But we could also just observe whether anyone reports crashes.

> However note that locally recompiling OpenBLAS is a supported and
> documented procedure, for those who want to take full advantage of
> their hardware.
> 
> Regarding the kernel that is currently built in the official binary, I
> could do with some help to determine which one is the best. You can see
> the list of kernels at this address:
> https://salsa.debian.org/science-team/openblas/-/tree/master/kernel/power
> Each KERNEL.* file lists a bunch of source files, many of which are
> assembly files.
> Currently, I use POWER4 for ppc64 and PPCG4 for powerpc, but I’m unsure
> that those are the right choice. I want a kernel that respects the
> baseline, but still takes advantage of all that is in the baseline.

I would have to look into that with more detail. We can probably also
discuss this on the #debian-ports IRC channel.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: Altivec in baseline for ppc64?

2021-07-15 Thread Sébastien Villemot
On Tuesday 13 July 2021 at 21:04 +0200, John Paul Adrian Glaubitz wrote:
> On 7/13/21 1:55 PM, Sébastien Villemot wrote:
> > The wiki page that synthesizes architecture specificities indicates
> > that Altivec is included in the baseline for the ppc64 port:
> > https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
> > 
> > However my understanding is that this port supports any powerpc64 CPU,
> > including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
> > also what the main wiki page for PPC64 says:
> > https://wiki.debian.org/PPC64
> > 
> > Can someone please clarify the situation?
> 
> The ppc64 originally used the ppc64 baseline including AltiVec e.g PowerPC970,
> (64-Bit PowerMac). However, the previous port maintainer decided he wanted to
> support embedded systems such as the PowerPC E5500 which does not support
> AltiVec.

Thanks for this clarification. I have updated the architectures wiki
page accordingly.

> > (I’m asking because I’m the maintainer of the openblas package, and
> > knowing whether Altivec is available or not, and more generally what is
> > in the baseline, is essential for proper packaging).
> 
> Please go ahead and enable AltiVec as I don't think it makes much sense to
> use BLAS on machines without any SIMD support. If any user complains about
> compatibility issues, please feel free to bring up the issue here again.

I think I disagree with this idea. OpenBLAS can be pulled in by chains
of dependencies, even for users who do not even know what BLAS is.
Violating the baseline can lead to hard-to-understand crashes.
Since I think that reliability is more important than performance, I 
prefer to strictly respect the baseline in the binary package.

However note that locally recompiling OpenBLAS is a supported and
documented procedure, for those who want to take full advantage of
their hardware.

Regarding the kernel that is currently built in the official binary, I
could do with some help to determine which one is the best. You can see
the list of kernels at this address:
https://salsa.debian.org/science-team/openblas/-/tree/master/kernel/power
Each KERNEL.* file lists a bunch of source files, many of which are
assembly files.
Currently, I use POWER4 for ppc64 and PPCG4 for powerpc, but I’m unsure
that those are the right choice. I want a kernel that respects the
baseline, but still takes advantage of all that is in the baseline.

-- 
⢀⣴⠾⠻⢶⣦⠀  Sébastien Villemot
⣾⠁⢠⠒⠀⣿⡁  Debian Developer
⢿⡄⠘⠷⠚⠋⠀  https://sebastien.villemot.name
⠈⠳⣄  https://www.debian.org





Re: Altivec in baseline for ppc64?

2021-07-13 Thread William Bonnet
Hi


On 13/07/2021 21:35, Karoly Balogh wrote:
> Just to note it here, this CPU family is also used by some relatively
> recent desktop-class PowerPC machines, like A-Eon's AmigaOne X5000 (and
> its "CyrusPlus" motherboard). So it's not just about the support of some
> obscure embedded dev board.


e5500 is also used by NXP in their T1040 QorIQ network appliances. I
recompiled and ported most Debian 8 packages to these boxes about
three to four years ago. It went pretty smoothly, even if in some cases I
had to patch some upstream source code because of inline ASM code or
hardcoded compiler options in Makefiles. Fortunately, it was only a few
pieces of software, which we were not using in the network use case.

The initial build was done using 5 dual-CPU Power Mac G5s. Once it
had been proven to work and the build chain was bootstrapped, I recompiled a
few thousand packages using a POWER8 machine installed with the packages
built on the Macs.


So in some cases like mine it is not about embedded boards or Amiga
clones :) and I am still an Amiga lover and owner 30 years later :P


Anyway, I'll be happy to share and talk about this any day.


Cheers

W.



-- 

kind regards,
William   https://forum.armwizard.org

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁   wbonnet@(armwizard|firmwaretoolkit|neuralnet-studio).org
 ⢿⡄⠘⠷⠚⠋⠀ GPG fingerprint: 7189 DC8E 15B9 B3E4 EA3E 902B 8EAC F0B9 25A5 9D48
 ⠈⠳⣄






Re: Altivec in baseline for ppc64?

2021-07-13 Thread Christian Zigotzky



> On 13. Jul 2021, at 21:44, Karoly Balogh wrote:
> 
> Hi,
> 
> On Tue, 13 Jul 2021, John Paul Adrian Glaubitz wrote:
> 
>>> However my understanding is that this port supports any powerpc64 CPU,
>>> including some that don’t have Altivec (e.g. POWER4 or POWER5). This
>>> is also what the main wiki page for PPC64 says:
>>> https://wiki.debian.org/PPC64
>>> 
>>> Can someone please clarify the situation?
>> 
>> The ppc64 originally used the ppc64 baseline including AltiVec e.g
>> PowerPC970, (64-Bit PowerMac). However, the previous port maintainer
>> decided he wanted to support embedded systems such as the PowerPC E5500
>> which does not support AltiVec.
> 
> Just to note it here, this CPU family is also used by some relatively
> recent desktop-class PowerPC machines, like A-Eon's AmigaOne X5000 (and
> its "CyrusPlus" motherboard). So it's not just about the support of some
> obscure embedded dev board.
> 
> Charlie

Charlie,

I didn’t see that you had already posted a note about the X5000. Thanks
for the hint.

Cheers,
Christian


Re: Altivec in baseline for ppc64?

2021-07-13 Thread Christian Zigotzky



> On 13. Jul 2021, at 21:05, John Paul Adrian Glaubitz wrote:
> 
> Hi Sébastien!
> 
>> On 7/13/21 1:55 PM, Sébastien Villemot wrote:
>> The wiki page that synthesizes architecture specificities indicates
>> that Altivec is included in the baseline for the ppc64 port:
>> https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
>> 
>> However my understanding is that this port supports any powerpc64 CPU,
>> including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
>> also what the main wiki page for PPC64 says:
>> https://wiki.debian.org/PPC64
>> 
>> Can someone please clarify the situation?
> 
> The ppc64 originally used the ppc64 baseline including AltiVec e.g PowerPC970,
> (64-Bit PowerMac). However, the previous port maintainer decided he wanted to
> support embedded systems such as the PowerPC E5500 which does not support
> AltiVec.

This was the right decision. Some users use Debian PPC64 on their AmigaOne 
X5000 machines and these machines don’t have AltiVec.


Re: Altivec in baseline for ppc64?

2021-07-13 Thread Jeffrey Walton
On Tue, Jul 13, 2021 at 2:20 PM Sébastien Villemot wrote:
>
> On Tuesday 13 July 2021 at 20:06 +0200, Mathieu Malaterre wrote:
> > On Tue, Jul 13, 2021 at 7:21 PM Sébastien Villemot wrote:
> > > On Tuesday 13 July 2021 at 18:56 +0200, Mathieu Malaterre wrote:
> > > >
> > > > On Tue, Jul 13, 2021 at 2:04 PM Sébastien Villemot wrote:
> > > > >
> > > > > The wiki page that synthesizes architecture specificities indicates
> > > > > that Altivec is included in the baseline for the ppc64 port:
> > > > > https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
> > > > >
> > > > > However my understanding is that this port supports any powerpc64 CPU,
> > > > > including some that don’t have Altivec (e.g. POWER4 or POWER5). This
> > > > > is also what the main wiki page for PPC64 says:
> > > > > https://wiki.debian.org/PPC64
> > > > >
> > > > > Can someone please clarify the situation?
> > > > >
> > > > > (I’m asking because I’m the maintainer of the openblas package, and
> > > > > knowing whether Altivec is available or not, and more generally what
> > > > > is in the baseline, is essential for proper packaging).
> > > >
> > > > I do not believe that you can do much as a packager. You cannot assume
> > > > anything on the target arch. You need to do the same thing as ffmpeg
> > > > is doing for avx2/sse4 on amd64, you need to do runtime detection. So
> > > > unless upstream is doing something very clever you cannot compile blas
> > > > using any of the fancy altivec instructions :(
> > > >
> > > > The man page for ld.so mentions something about optimized libraries
> > > > (search for "/usr/lib/sse2/"), but this is currently not in use in
> > > > Debian (AFAIK).
> > >
> > > Actually OpenBLAS has its own runtime detection mechanism, which is
> > > used to select the best linear algebra kernel for the current CPU
> > > (those kernels are mainly written in assembly, and take advantage of
> > > available ISA extensions). This mechanism is used on several archs,
> > > including ppc64el (so at runtime, OpenBLAS chooses between a POWER8 and
> > > a POWER9 kernel; there is even a POWER10 kernel already available).
> > >
> > > However, I cannot enable this mechanism on ppc64 and powerpc, because
> > > the runtime detection only works for POWER6 and above, and my
> > > understanding is that for these two ports the baseline is lower. Hence
> > > on these two archs, only one kernel is included in the package binaries
> > > (currently POWER4 for ppc64 and PPCG4 for powerpc). For optimal
> > > performance, users should recompile OpenBLAS locally (as indicated in
> > > the package description and in README.Debian).
> >
> > There are plenty of people on this mailing list that could test/verify
> > that. Is there a quick way to check that your openblas package is
> > compiled correctly for ppc32 and ppc64 (like a verbose mode) ? Did you
> > do any experiment on perotto.debian.net ?
>
> perotto.debian.net is POWER8, so it’s clearly well above the baseline.
> The package runs fine there, but that does not tell anything about
> baseline violation.
>
> Verifying that the package compiled fine and passed its testsuite on
> build daemons does not give any information about baseline violation
> either, because buildds are probably above the baseline as well. FYI,
> the most recent build logs are there:
> https://buildd.debian.org/status/package.php?p=openblas&suite=experimental
> (there is a problem with powerpc in experimental; but the version in
> sid compiled).
>
> If nobody has the relevant knowledge, then the only option is to test
> the package on the oldest possible hardware. The easiest way to test it
> is to recompile it locally (since this will exercise the testsuite).

I can provide SSH access to a PowerMac G5 with Altivec. That should
test the delineation between Altivec and PWR{5-10}.

If OpenBLAS needs to do 64-bit math, then I have routines cribbed
away that perform 64-bit addition and subtraction using 32x4 vectors.
The routines have to handle carry/borrow themselves. My experience
with Crypto++ and algos like ChaCha20 demonstrates it is profitable.
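
A portable sketch of the carry-handling idea, written in plain C rather than AltiVec intrinsics so it runs anywhere. It is not the Crypto++ code: the u32x4 type and function names are made up, and the lane layout (high word first, as on big-endian PowerPC) is an assumption. On AltiVec the three stages would presumably map to vec_add, vec_addc (carry-out of an unsigned word add) and a permute or shift to move the carries, but that mapping is my reading, not something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Four 32-bit lanes holding two 64-bit values as [hi0, lo0, hi1, lo1]. */
typedef struct { uint32_t lane[4]; } u32x4;

u32x4 from_u64(uint64_t x0, uint64_t x1)
{
    u32x4 v = {{ (uint32_t)(x0 >> 32), (uint32_t)x0,
                 (uint32_t)(x1 >> 32), (uint32_t)x1 }};
    return v;
}

uint64_t to_u64(u32x4 v, int idx)
{
    assert(idx == 0 || idx == 1);
    return ((uint64_t)v.lane[2 * idx] << 32) | v.lane[2 * idx + 1];
}

/* 64-bit addition (mod 2^64) built only from 32-bit lane operations. */
u32x4 add64_via_u32x4(u32x4 a, u32x4 b)
{
    u32x4 sum, carry, r;
    /* Stage 1: lane-wise add, each lane wraps mod 2^32. */
    for (int i = 0; i < 4; i++)
        sum.lane[i] = a.lane[i] + b.lane[i];
    /* Stage 2: lane-wise carry-out (1 if the add wrapped). */
    for (int i = 0; i < 4; i++)
        carry.lane[i] = sum.lane[i] < a.lane[i];
    /* Stage 3: move each low lane's carry into its high lane. */
    for (int i = 0; i < 4; i += 2) {
        r.lane[i]     = sum.lane[i] + carry.lane[i + 1];
        r.lane[i + 1] = sum.lane[i + 1];
    }
    return r;
}
```

Subtraction would follow the same shape with a borrow instead of a carry.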

Send over your SSH public key/authorized_keys, if interested.

Jeff



Re: Altivec in baseline for ppc64?

2021-07-13 Thread Karoly Balogh
Hi,

On Tue, 13 Jul 2021, John Paul Adrian Glaubitz wrote:

> > However my understanding is that this port supports any powerpc64 CPU,
> > including some that don’t have Altivec (e.g. POWER4 or POWER5). This
> > is also what the main wiki page for PPC64 says:
> > https://wiki.debian.org/PPC64
> >
> > Can someone please clarify the situation?
>
> The ppc64 originally used the ppc64 baseline including AltiVec e.g
> PowerPC970, (64-Bit PowerMac). However, the previous port maintainer
> decided he wanted to support embedded systems such as the PowerPC E5500
> which does not support AltiVec.

Just to note it here, this CPU family is also used by some relatively
recent desktop-class PowerPC machines, like A-Eon's AmigaOne X5000 (and
its "CyrusPlus" motherboard). So it's not just about the support of some
obscure embedded dev board.

Charlie

Re: Altivec in baseline for ppc64?

2021-07-13 Thread William Bonnet
Hi,


>> I am however not sure that my current choices for the ppc64 and powerpc
>> baselines are optimal, hence this thread.


If I can help you I'll be happy to. I have several Power Mac G5s running
current Debian, and I can use one to test your packages as long as you tell
me a little about the software to test (I don't know OpenBLAS).
I am also used to building Debian packages on PPC64 and to CPU
support issues. We can keep talking about this live on IRC. If
needed we can send the technical conclusions to the list. How about this? :)

cheers

W.

-- 

kind regards,
William   https://forum.armwizard.org

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁   wbonnet@(armwizard|firmwaretoolkit|neuralnet-studio).org
 ⢿⡄⠘⠷⠚⠋⠀ GPG fingerprint: 7189 DC8E 15B9 B3E4 EA3E 902B 8EAC F0B9 25A5 9D48
 ⠈⠳⣄






Re: Altivec in baseline for ppc64?

2021-07-13 Thread John Paul Adrian Glaubitz
On 7/13/21 6:56 PM, Mathieu Malaterre wrote:
> I do not believe that you can do much as a packager. You cannot assume
> anything on the target arch. You need to do the same thing as ffmpeg
> is doing for avx2/sse4 on amd64, you need to do runtime detection. So
> unless upstream is doing something very clever you cannot compile blas
> using any of the fancy altivec instructions :(

For performance-related libraries I'm okay with AltiVec; we're enabling it for
NSS2 as well, and so far no user has reported any issues.

If that changes in the future, i.e. someone is actually complaining, we can
still revisit this issue.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: Altivec in baseline for ppc64?

2021-07-13 Thread John Paul Adrian Glaubitz
Hi Sébastien!

On 7/13/21 1:55 PM, Sébastien Villemot wrote:
> The wiki page that synthesizes architecture specificities indicates
> that Altivec is included in the baseline for the ppc64 port:
> https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
> 
> However my understanding is that this port supports any powerpc64 CPU,
> including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
> also what the main wiki page for PPC64 says:
> https://wiki.debian.org/PPC64
> 
> Can someone please clarify the situation?

The ppc64 port originally used a baseline that included AltiVec, e.g. the
PowerPC 970 (64-bit PowerMac). However, the previous port maintainer decided he
wanted to support embedded systems such as the PowerPC e5500, which does not
support AltiVec.

I wasn't really a fan of that change, but my stance is that we should use AltiVec
in packages where it makes sense, as the majority of the ppc64 port users will
have a machine that supports AltiVec.

If they run into an issue with these packages on non-AltiVec systems, they can
still file a bug.

> (I’m asking because I’m the maintainer of the openblas package, and
> knowing whether Altivec is available or not, and more generally what is
> in the baseline, is essential for proper packaging).

Please go ahead and enable AltiVec, as I don't think it makes much sense to use
BLAS on machines without any SIMD support. If any user complains about
compatibility issues, please feel free to bring up the issue here again.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: Altivec in baseline for ppc64?

2021-07-13 Thread Sébastien Villemot
Le mardi 13 juillet 2021 à 20:06 +0200, Mathieu Malaterre a écrit :
> On Tue, Jul 13, 2021 at 7:21 PM Sébastien Villemot  
> wrote:
> > Le mardi 13 juillet 2021 à 18:56 +0200, Mathieu Malaterre a écrit :
> > > 
> > > On Tue, Jul 13, 2021 at 2:04 PM Sébastien Villemot  
> > > wrote:
> > > > 
> > > > The wiki page that synthesizes architecture specificities indicates
> > > > that Altivec is included in the baseline for the ppc64 port:
> > > > https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
> > > > 
> > > > However my understanding is that this port supports any powerpc64 CPU,
> > > > including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
> > > > also what the main wiki page for PPC64 says:
> > > > https://wiki.debian.org/PPC64
> > > > 
> > > > Can someone please clarify the situation?
> > > > 
> > > > (I’m asking because I’m the maintainer of the openblas package, and
> > > > knowing whether Altivec is available or not, and more generally what is
> > > > in the baseline, is essential for proper packaging).
> > > 
> > > I do not believe that you can do much as a packager. You cannot assume
> > > anything on the target arch. You need to do the same thing as ffmpeg
> > > is doing for avx2/sse4 on amd64, you need to do runtime detection. So
> > > unless upstream is doing something very clever you cannot compile blas
> > > using any of the fancy altivec instructions :(
> > > 
> > > The man page for ld.so mentions something about optimized libraries
> > > (search for "/usr/lib/sse2/"), but this is currently not in use in
> > > Debian (AFAIK).
> > 
> > Actually OpenBLAS has its own runtime detection mechanism, which is
> > used to select the best linear algebra kernel for the current CPU
> > (those kernels are mainly written in assembly, and take advantage of
> > available ISA extensions). This mechanism is used on several archs,
> > including ppc64el (so at runtime, OpenBLAS chooses between a POWER8 and
> > a POWER9 kernel; there is even a POWER10 kernel already available).
> > 
> > However, I cannot enable this mechanism on ppc64 and powerpc, because
> > the runtime detection only works for POWER6 and above, and my
> > understanding is that for these two ports the baseline is lower. Hence
> > on these two archs, only one kernel is included in the package binaries
> > (currently POWER4 for ppc64 and PPCG4 for powerpc). For optimal
> > performance, users should recompile OpenBLAS locally (as indicated in
> > the package description and in README.Debian).
> 
> There are plenty of people on this mailing list that could test/verify
> that. Is there a quick way to check that your openblas package is
> compiled correctly for ppc32 and ppc64 (like a verbose mode) ? Did you
> do any experiment on perotto.debian.net ?

perotto.debian.net is POWER8, so it’s clearly well above the baseline.
The package runs fine there, but that does not tell us anything about
baseline violations.

Verifying that the package compiled fine and passed its testsuite on
build daemons does not give any information about baseline violation
either, because buildds are probably above the baseline as well. FYI,
the most recent build logs are there:
https://buildd.debian.org/status/package.php?p=openblas&suite=experimental
(there is a problem with powerpc in experimental; but the version in
sid compiled).
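
Since a successful build and test run cannot reveal a baseline violation, one
complementary approach is a static look at the compiled code. The following is
a minimal Python sketch, not an existing Debian tool: it scans the text output
of `objdump -d` for AltiVec mnemonics. The function name, the sample text and
the mnemonic list are assumptions made for this example, and the list is
deliberately partial.

```python
# Partial, illustrative list of AltiVec (VMX) mnemonics; a real check would
# need the complete instruction set.
ALTIVEC_MNEMONICS = {
    "lvx", "lvxl", "stvx", "stvxl", "lvsl", "lvsr",
    "vperm", "vsldoi", "vspltw", "vxor", "vor", "vand",
    "vaddfp", "vsubfp", "vmaddfp", "vnmsubfp",
}


def altivec_instructions(disassembly):
    """Return the set of listed AltiVec mnemonics found in `objdump -d` text."""
    found = set()
    for line in disassembly.splitlines():
        fields = line.split("\t")
        # Instruction lines look like "  addr:<TAB>hex bytes<TAB>mnemonic operands".
        if len(fields) < 3 or not fields[0].rstrip().endswith(":"):
            continue
        # Mnemonic and operands may be separated by spaces or by a tab.
        parts = " ".join(fields[2:]).split()
        if parts and parts[0] in ALTIVEC_MNEMONICS:
            found.add(parts[0])
    return found


# Hand-written sample in objdump's layout; the hex bytes are placeholders.
SAMPLE = (
    "0000000000001000 <kernel>:\n"
    "    1000:\t00 00 00 00 \tlvx     v0,0,r3\n"
    "    1004:\t00 00 00 00 \tvmaddfp v1,v0,v2,v3\n"
    "    1008:\t00 00 00 00 \tfmadd   f1,f2,f3,f4\n"
    "    100c:\t00 00 00 00 \tblr\n"
)

if __name__ == "__main__":
    print(sorted(altivec_instructions(SAMPLE)))
```

A hit is only a prompt to look closer, not proof of a violation: a library
that does its own runtime dispatch legitimately contains AltiVec code in
paths that are guarded by a CPU check.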

If nobody has the relevant knowledge, then the only option is to test
the package on the oldest possible hardware. The easiest way to test it
is to recompile it locally (since this will exercise the testsuite).
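
Before recompiling on an old machine, a tester may want to confirm what the
installed library reports about itself. The sketch below is an illustration
under assumptions, not part of the package: it calls `openblas_get_config()`,
a function exported by OpenBLAS, through Python's ctypes. To my understanding
the returned string includes the core the library was built for, but that
should be verified against the installed version. The wrapper name
`openblas_config` and its `library` parameter are made up for this example.

```python
import ctypes
import ctypes.util


def openblas_config(library=None):
    """Return OpenBLAS's build configuration string, or None if unavailable."""
    # find_library may not locate the library on every system; in that case
    # pass the full path of the shared object explicitly.
    name = library or ctypes.util.find_library("openblas")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
        get_config = lib.openblas_get_config
    except (OSError, AttributeError):
        # Library could not be loaded, or it does not export the symbol.
        return None
    get_config.restype = ctypes.c_char_p
    get_config.argtypes = []
    value = get_config()
    return value.decode("ascii", "replace") if value else None


if __name__ == "__main__":
    print(openblas_config())
```

This only reports the build configuration. It does not show whether the code
stays within the baseline; for that, running the testsuite on old hardware
remains the real check.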

Note that nobody has complained for years about the situation of openblas
on powerpc and ppc64. So maybe that’s a sign that the current setting
is fine (either the baseline is respected, or nobody uses the package).

-- 
⢀⣴⠾⠻⢶⣦⠀  Sébastien Villemot
⣾⠁⢠⠒⠀⣿⡁  Debian Developer
⢿⡄⠘⠷⠚⠋⠀  https://sebastien.villemot.name
⠈⠳⣄  https://www.debian.org





Re: Altivec in baseline for ppc64?

2021-07-13 Thread Mathieu Malaterre
On Tue, Jul 13, 2021 at 7:21 PM Sébastien Villemot  wrote:
>
> Hi Mathieu,
>
> Le mardi 13 juillet 2021 à 18:56 +0200, Mathieu Malaterre a écrit :
> >
> > On Tue, Jul 13, 2021 at 2:04 PM Sébastien Villemot  
> > wrote:
> > >
> > > The wiki page that synthesizes architecture specificities indicates
> > > that Altivec is included in the baseline for the ppc64 port:
> > > https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
> > >
> > > However my understanding is that this port supports any powerpc64 CPU,
> > > including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
> > > also what the main wiki page for PPC64 says:
> > > https://wiki.debian.org/PPC64
> > >
> > > Can someone please clarify the situation?
> > >
> > > (I’m asking because I’m the maintainer of the openblas package, and
> > > knowing whether Altivec is available or not, and more generally what is
> > > in the baseline, is essential for proper packaging).
> >
> > I do not believe that you can do much as a packager. You cannot assume
> > anything on the target arch. You need to do the same thing as ffmpeg
> > is doing for avx2/sse4 on amd64, you need to do runtime detection. So
> > unless upstream is doing something very clever you cannot compile blas
> > using any of the fancy altivec instructions :(
> >
> > The man page for ld.so mentions something about optimized libraries
> > (search for "/usr/lib/sse2/"), but this is currently not in use in
> > Debian (AFAIK).
>
> Actually OpenBLAS has its own runtime detection mechanism, which is
> used to select the best linear algebra kernel for the current CPU
> (those kernels are mainly written in assembly, and take advantage of
> available ISA extensions). This mechanism is used on several archs,
> including ppc64el (so at runtime, OpenBLAS chooses between a POWER8 and
> a POWER9 kernel; there is even a POWER10 kernel already available).
>
> However, I cannot enable this mechanism on ppc64 and powerpc, because
> the runtime detection only works for POWER6 and above, and my
> understanding is that for these two ports the baseline is lower. Hence
> on these two archs, only one kernel is included in the package binaries
> (currently POWER4 for ppc64 and PPCG4 for powerpc). For optimal
> performance, users should recompile OpenBLAS locally (as indicated in
> the package description and in README.Debian).

There are plenty of people on this mailing list who could test/verify
that. Is there a quick way to check that your openblas package is
compiled correctly for ppc32 and ppc64 (like a verbose mode)? Did you
run any experiments on perotto.debian.net?

> I am however not sure that my current choices for the ppc64 and powerpc
> baselines are optimal, hence this thread.
>
> --
> ⢀⣴⠾⠻⢶⣦⠀  Sébastien Villemot
> ⣾⠁⢠⠒⠀⣿⡁  Debian Developer
> ⢿⡄⠘⠷⠚⠋⠀  https://sebastien.villemot.name
> ⠈⠳⣄  https://www.debian.org
>



Re: Altivec in baseline for ppc64?

2021-07-13 Thread Sébastien Villemot
Hi Mathieu,

Le mardi 13 juillet 2021 à 18:56 +0200, Mathieu Malaterre a écrit :
> 
> On Tue, Jul 13, 2021 at 2:04 PM Sébastien Villemot  
> wrote:
> > 
> > The wiki page that synthesizes architecture specificities indicates
> > that Altivec is included in the baseline for the ppc64 port:
> > https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
> > 
> > However my understanding is that this port supports any powerpc64 CPU,
> > including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
> > also what the main wiki page for PPC64 says:
> > https://wiki.debian.org/PPC64
> > 
> > Can someone please clarify the situation?
> > 
> > (I’m asking because I’m the maintainer of the openblas package, and
> > knowing whether Altivec is available or not, and more generally what is
> > in the baseline, is essential for proper packaging).
> 
> I do not believe that you can do much as a packager. You cannot assume
> anything on the target arch. You need to do the same thing as ffmpeg
> is doing for avx2/sse4 on amd64, you need to do runtime detection. So
> unless upstream is doing something very clever you cannot compile blas
> using any of the fancy altivec instructions :(
> 
> The man page for ld.so mentions something about optimized libraries
> (search for "/usr/lib/sse2/"), but this is currently not in use in
> Debian (AFAIK).

Actually OpenBLAS has its own runtime detection mechanism, which is
used to select the best linear algebra kernel for the current CPU
(those kernels are mainly written in assembly, and take advantage of
available ISA extensions). This mechanism is used on several archs,
including ppc64el (so at runtime, OpenBLAS chooses between a POWER8 and
a POWER9 kernel; there is even a POWER10 kernel already available).

However, I cannot enable this mechanism on ppc64 and powerpc, because
the runtime detection only works for POWER6 and above, and my
understanding is that for these two ports the baseline is lower. Hence
on these two archs, only one kernel is included in the package binaries
(currently POWER4 for ppc64 and PPCG4 for powerpc). For optimal
performance, users should recompile OpenBLAS locally (as indicated in
the package description and in README.Debian).
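
To make the selection logic above concrete, here is a minimal Python model of
it. This is not OpenBLAS code (the real kernels are assembly and the real
detection is done in C); it only mirrors the behaviour described in this
message. The names `GENERATIONS`, `DETECTABLE_FROM` and `select_kernel` are
hypothetical.

```python
# Hypothetical model, not OpenBLAS code: CPU generations ordered from oldest
# to newest.
GENERATIONS = ["POWER4", "POWER5", "POWER6", "POWER7", "POWER8", "POWER9", "POWER10"]
DETECTABLE_FROM = "POWER6"   # per this thread: detection only works for POWER6 and above


def select_kernel(cpu, shipped_kernels, baseline_kernel):
    """Pick the newest shipped kernel that is not newer than the running CPU.

    Falls back to baseline_kernel when the CPU is unknown or predates the
    generations that runtime detection can identify.
    """
    if cpu not in GENERATIONS:
        return baseline_kernel
    level = GENERATIONS.index(cpu)
    if level < GENERATIONS.index(DETECTABLE_FROM):
        return baseline_kernel
    usable = [k for k in shipped_kernels if GENERATIONS.index(k) <= level]
    return max(usable, key=GENERATIONS.index) if usable else baseline_kernel


if __name__ == "__main__":
    # ppc64el-like case: several kernels shipped, the CPU is detectable.
    print(select_kernel("POWER9", ["POWER8", "POWER9", "POWER10"], "POWER8"))
    # ppc64-like case: CPU below the detection threshold, fixed kernel is used.
    print(select_kernel("POWER5", ["POWER6", "POWER8"], "POWER4"))
```

The second call shows why the mechanism does not help on ppc64 and powerpc:
any CPU the detection cannot identify ends up on the single baseline kernel
anyway.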

I am however not sure that my current choices for the ppc64 and powerpc
baselines are optimal, hence this thread.

-- 
⢀⣴⠾⠻⢶⣦⠀  Sébastien Villemot
⣾⠁⢠⠒⠀⣿⡁  Debian Developer
⢿⡄⠘⠷⠚⠋⠀  https://sebastien.villemot.name
⠈⠳⣄  https://www.debian.org





Re: Altivec in baseline for ppc64?

2021-07-13 Thread Mathieu Malaterre
Hi Sébastien,

On Tue, Jul 13, 2021 at 2:04 PM Sébastien Villemot  wrote:
>
> Hi,
>
> The wiki page that synthesizes architecture specificities indicates
> that Altivec is included in the baseline for the ppc64 port:
> https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
>
> However my understanding is that this port supports any powerpc64 CPU,
> including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
> also what the main wiki page for PPC64 says:
> https://wiki.debian.org/PPC64
>
> Can someone please clarify the situation?
>
> (I’m asking because I’m the maintainer of the openblas package, and
> knowing whether Altivec is available or not, and more generally what is
> in the baseline, is essential for proper packaging).

I do not believe that you can do much as a packager. You cannot assume
anything about the target arch. You need to do the same thing that ffmpeg
does for avx2/sse4 on amd64: runtime detection. So
unless upstream is doing something very clever, you cannot compile blas
using any of the fancy altivec instructions :(

The man page for ld.so mentions something about optimized libraries
(search for "/usr/lib/sse2/"), but this is currently not in use in
Debian (AFAIK).
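
For illustration, runtime detection of this kind can be sketched as follows.
This is a minimal Python sketch assuming Linux on PowerPC; it is not what
ffmpeg or any BLAS implementation actually does. It reads the process's
auxiliary vector and tests the AltiVec bit in AT_HWCAP. The helper names
(`parse_auxv`, `has_altivec`, `altivec_available`) are made up for this
example; AT_HWCAP = 16 and PPC_FEATURE_HAS_ALTIVEC = 0x10000000 are the values
from the Linux headers.

```python
import platform
import struct

AT_NULL = 0
AT_HWCAP = 16                          # auxv key for the hardware capability bitmask
PPC_FEATURE_HAS_ALTIVEC = 0x10000000   # bit value from the Linux PowerPC headers


def parse_auxv(raw):
    """Decode the (key, value) pairs of an auxiliary vector into a dict."""
    entry = struct.Struct("@LL")       # two native unsigned longs per entry
    result = {}
    for offset in range(0, len(raw) - entry.size + 1, entry.size):
        key, value = entry.unpack_from(raw, offset)
        if key == AT_NULL:             # AT_NULL terminates the vector
            break
        result[key] = value
    return result


def has_altivec(hwcap):
    """True if the AltiVec bit is set in a PowerPC AT_HWCAP value."""
    return bool(hwcap & PPC_FEATURE_HAS_ALTIVEC)


def altivec_available(path="/proc/self/auxv"):
    """Check the running process; False off PowerPC or if auxv is unreadable."""
    # The bit layout of AT_HWCAP is architecture specific, so do not
    # interpret it on anything that is not PowerPC.
    if not platform.machine().startswith("ppc"):
        return False
    try:
        with open(path, "rb") as f:
            auxv = parse_auxv(f.read())
    except OSError:
        return False
    return has_altivec(auxv.get(AT_HWCAP, 0))


if __name__ == "__main__":
    print("AltiVec available:", altivec_available())
```

In C the same bit can be read with getauxval(AT_HWCAP) from <sys/auxv.h>; a
library would then pick its vector or scalar code path once at start-up.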

2cts