Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-16 Thread Matthew Wilcox
On Mon, Jan 16, 2023 at 10:41:14AM +0100, John Paul Adrian Glaubitz wrote:
> On 1/15/23 01:27, Matthew Wilcox wrote:
> > More useful perhaps is to look at https://popcon.debian.org/
> > 
> > There are three machines reporting popcon results.  It's dead.
> 
> It's an opt-in mechanism that reports 190,000 machines running Debian
> on x86_64. Do you think that there are only 190,000 servers world-wide
> running Debian?

No.  I think there are so few ia64 machines still in operation that the
effort required to continue to support them is now too high relative
to the benefits.  We've dropped support for hardware that still exists
before, and we'll do it again.  The only question is when.

I still have two ia64 machines in my basement.  I've turned one of them
on once since 2009.  And that was because I had a bug I needed to track
down and fix (2018, commits 4b664e739f77 and 2879b65f9de8).


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-16 Thread Ard Biesheuvel
On Mon, 16 Jan 2023 at 10:33, John Paul Adrian Glaubitz
 wrote:
>
> Hi Ard!
>
> On 1/14/23 00:25, Ard Biesheuvel wrote:
> > Thanks for reporting back. I (mis)read the debian ports page [3],
> > which mentions Debian 7 as the highest Debian version that supports
> > IA64, and so I assumed that support had been dropped from Debian.
>
> This page talks about officially supported ports. Debian Ports is an
> unofficial spin maintained by a number of Debian Developers and external
> developers that are volunteering to maintain these ports.
>
> > However, if only a handful of people want to keep this port alive for
> > reasons of nostalgia, it is obviously obsolete, and we should ask
> > ourselves whether it is reasonable to expect Linux contributors to
> > keep spending time on this.
>
> You could say this about a lot of hardware, can't you?
>

Uhm, yes. Linux contributor effort is a scarce resource, and spending
it on architectures that nobody actually uses, such as alpha or ia64,
means it is not spent on things that are useful to more people.

I really do sympathize with the enthusiast/hobbyist PoV - I am also an
engineer that likes to tinker. So 'use' can be defined liberally here,
and cover running the latest Linux on ancient hardware just for
entertainment.

However, the question is not how you or I choose to spend (or waste)
their time. The question is whether it is reasonable *as a community*
to insist that everyone who contributes a cross-architecture change
also has to ensure that obsolete architectures such as i64 or alpha
are not left behind.

The original thread is an interesting example here - removing a
cpu_relax() in cmpxchg() that was only there because of IA64's clunky
SMT implementation. Perhaps this means that IA64 performance is going
to regress substantially for some workloads? Should anyone care?
Should we test such changes first? And how should we do that if there
is no maintainer and nobody has access to the hardware?

The other example is EFI, which i maintain. Should I require from
contributors that they build and boot test EFI changes on ia64 if I
myself don't even have access to the hardware? It is good to know that
things don't seem to be broken today, but if it is going to fall over,
it may take a while before anybody notices. What happens then?

> > Does the Debian ia64 port have any users? Or is the system that builds
> > the packages the only one that consumes them?
>
> There is the popcon statistics. However, that is opt-on and the numbers are
> not really trustworthy. We are getting feedback from time to time from people
> using it.
>
> Is there any problem with the ia64 port at the moment that would justify 
> removal?
>

I would argue that we should mark it obsolete at the very least, so
that it is crystal clear that regressing IA64 (either knowingly or
unknowingly) by a generic or cross-architecture change is not a
showstopper, even at build time. Then, if someone has the skill set
and the time on their hands, as well as access to actual hardware,
they can keep it alive if they want to.


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-16 Thread John Paul Adrian Glaubitz

On 1/15/23 13:04, Sedat Dilek wrote:

Exactly, Debian Popularity Contest was what I was looking for yesterday.

Thanks Matthew.

[1] says in Inst (204701):

Name  || Number  || %
==
binutils-x86-64-linux-gnu || 101548  || 49.61%
binutils-ia64-linux-gnu ||  11  || 0.01%

HELP: Inst. is the number of people who installed this package (sum of
the four categories below)

There may be more popular packages than binutils.
( binutils might tell something about development happening or not. )

Anyway, I am not a popcon expert and never participated in Debian's
Popularity Contest.


And that's the point, it's opt-in!

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-16 Thread John Paul Adrian Glaubitz

On 1/15/23 01:27, Matthew Wilcox wrote:

More useful perhaps is to look at https://popcon.debian.org/

There are three machines reporting popcon results.  It's dead.


It's an opt-in mechanism that reports 190,000 machines running Debian
on x86_64. Do you think that there are only 190,000 servers world-wide
running Debian?

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-16 Thread John Paul Adrian Glaubitz

On 1/14/23 12:28, Sedat Dilek wrote:

[ ... ]


Best is to ask the Debian release-team or (if there exist) maintainers
or responsibles for the IA64 port - which is an ***unofficial*** port.



Here we go:

https://lists.debian.org/debian-ia64/

Posting address: debian-i...@lists.debian.org

Found via 


And the wiki lists Jason Duerstock, Jessica Clarke and me as maintainers:


https://wiki.debian.org/Ports/ia64


Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-16 Thread John Paul Adrian Glaubitz

On 1/14/23 12:24, Sedat Dilek wrote:

Example #1: binutils packages

Checking available binutils package for Debian/unstable IA64 (version:
2.39.90.20230110-1):

https://packages.debian.org/sid/binutils <--- Clearly states IA64 as
"unofficial port"


And?


https://packages.debian.org/sid/ia64/binutils/filelist

Example #2: linux-image packages

Cannot say what this means...

https://packages.debian.org/search?arch=amd64=linux-image
(AMD64 - matches)

https://packages.debian.org/search?arch=ia64=linux-image
(IA64 - no matches)

https://packages.debian.org/search?arch=ia64=linux (IA64 -
matches - but no linux-image which ships normally a bootable
Linux-kernel)


No, the package is called "linux-image-$ARCH-$FLAVOR". There is no "linux-image"
package for amd64 either. Does that prove amd64 is dead?

What you are looking for can be found here:


http://ftp.ports.debian.org/debian-ports/pool-ia64/main/l/linux/



As stated I have no expertise in Debian whatever release for IA64 arch.


Well, maybe let me answer the questions then since I am maintaining the port
in Debian.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-16 Thread John Paul Adrian Glaubitz

Hi Ard!

On 1/14/23 00:25, Ard Biesheuvel wrote:

Thanks for reporting back. I (mis)read the debian ports page [3],
which mentions Debian 7 as the highest Debian version that supports
IA64, and so I assumed that support had been dropped from Debian.


This page talks about officially supported ports. Debian Ports is an
unofficial spin maintained by a number of Debian Developers and external
developers that are volunteering to maintain these ports.


However, if only a handful of people want to keep this port alive for
reasons of nostalgia, it is obviously obsolete, and we should ask
ourselves whether it is reasonable to expect Linux contributors to
keep spending time on this.


You could say this about a lot of hardware, can't you?


Does the Debian ia64 port have any users? Or is the system that builds
the packages the only one that consumes them?


There is the popcon statistics. However, that is opt-on and the numbers are
not really trustworthy. We are getting feedback from time to time from people
using it.

Is there any problem with the ia64 port at the moment that would justify 
removal?

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-15 Thread Sedat Dilek
On Sun, Jan 15, 2023 at 1:27 AM Matthew Wilcox  wrote:
>
> On Sat, Jan 14, 2023 at 12:28:30PM +0100, Sedat Dilek wrote:
> > [ ... ]
> >
> > > Best is to ask the Debian release-team or (if there exist) maintainers
> > > or responsibles for the IA64 port - which is an ***unofficial*** port.
> > >
> >
> > Here we go:
> >
> > https://lists.debian.org/debian-ia64/
> >
> > Posting address: debian-i...@lists.debian.org
> >
> > Found via 
>
> More useful perhaps is to look at https://popcon.debian.org/
>
> There are three machines reporting popcon results.  It's dead.

Exactly, Debian Popularity Contest was what I was looking for yesterday.

Thanks Matthew.

[1] says in Inst (204701):

Name  || Number  || %
==
binutils-x86-64-linux-gnu || 101548  || 49.61%
binutils-ia64-linux-gnu ||  11  || 0.01%

HELP: Inst. is the number of people who installed this package (sum of
the four categories below)

There may be more popular packages than binutils.
( binutils might tell something about development happening or not. )

Anyway, I am not a popcon expert and never participated in Debian's
Popularity Contest.

-Sedat-

[1] https://qa.debian.org/popcon.php?package=binutils


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-14 Thread Matthew Wilcox
On Sat, Jan 14, 2023 at 12:28:30PM +0100, Sedat Dilek wrote:
> [ ... ]
> 
> > Best is to ask the Debian release-team or (if there exist) maintainers
> > or responsibles for the IA64 port - which is an ***unofficial*** port.
> >
> 
> Here we go:
> 
> https://lists.debian.org/debian-ia64/
> 
> Posting address: debian-i...@lists.debian.org
> 
> Found via 

More useful perhaps is to look at https://popcon.debian.org/

There are three machines reporting popcon results.  It's dead.


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-14 Thread Sedat Dilek
[ ... ]

> Best is to ask the Debian release-team or (if there exist) maintainers
> or responsibles for the IA64 port - which is an ***unofficial*** port.
>

Here we go:

https://lists.debian.org/debian-ia64/

Posting address: debian-i...@lists.debian.org

Found via 

-Sedat-


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-14 Thread Sedat Dilek
On Sat, Jan 14, 2023 at 12:43 AM Ard Biesheuvel  wrote:
>
> On Fri, 13 Jan 2023 at 22:06, John Paul Adrian Glaubitz
>  wrote:
> >
> > Hello Ard!
> >
> > > Can I take that as an ack on [0]? The EFI subsystem has evolved
> > > substantially over the years, and there is really no way to do any
> > > IA64 testing beyond build testing, so from that perspective, dropping
> > > it entirely would be welcomed.
> >
> > ia64 is regularly tested in Debian and Gentoo [1][2].
> >
> > Debian's ia64 porterbox yttrium runs a recent kernel without issues:
> >
> > root@yttrium:~# uname -a
> > Linux yttrium 5.19.0-2-mckinley #1 SMP Debian 5.19.11-1 (2022-09-24) ia64 
> > GNU/Linux
> > root@yttrium:~#
> >
> > root@yttrium:~# journalctl -b|head -n10
> > Nov 14 14:46:10 yttrium kernel: Linux version 5.19.0-2-mckinley 
> > (debian-ker...@lists.debian.org) (gcc-11 (Debian 11.3.0-6) 11.3.0, GNU ld 
> > (GNU Binutils for Debian) 2.39) #1 SMP Debian 5.19.11-1 (2022-09-24)
> > Nov 14 14:46:10 yttrium kernel: efi: EFI v2.10 by HP
> > Nov 14 14:46:10 yttrium kernel: efi: SALsystab=0xdfdd63a18 ESI=0xdfdd63f18 
> > ACPI 2.0=0x3d3c4014 HCDP=0xd8798 SMBIOS=0x3d368000
> > Nov 14 14:46:10 yttrium kernel: PCDP: v3 at 0xd8798
> > Nov 14 14:46:10 yttrium kernel: earlycon: uart8250 at I/O port 0x4000 
> > (options '115200n8')
> > Nov 14 14:46:10 yttrium kernel: printk: bootconsole [uart8250] enabled
> > Nov 14 14:46:10 yttrium kernel: ACPI: Early table checksum verification 
> > disabled
> > Nov 14 14:46:10 yttrium kernel: ACPI: RSDP 0x3D3C4014 24 (v02 
> > HP)
> > Nov 14 14:46:10 yttrium kernel: ACPI: XSDT 0x3D3C4580 000124 (v01 
> > HP RX2800-2 0001  0113)
> > Nov 14 14:46:10 yttrium kernel: ACPI: FACP 0x3D3BE000 F4 (v03 
> > HP RX2800-2 0001 HP   0001)
> > root@yttrium:~#
> >
> > Same applies to the buildds:
> >
> > root@lifshitz:~# uname -a
> > Linux lifshitz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 
> > GNU/Linux
> > root@lifshitz:~#
> >
> > root@lenz:~# uname -a
> > Linux lenz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 
> > GNU/Linux
> > root@lenz:~#
> >
> > EFI works fine as well using the latest version of GRUB2.
> >
> > Thanks,
> > Adrian
> >
> > > [1] https://cdimage.debian.org/cdimage/ports/snapshots/
> > > [2] https://mirror.yandex.ru/gentoo-distfiles//releases/ia64/autobuilds/
>
> Thanks for reporting back. I (mis)read the debian ports page [3],
> which mentions Debian 7 as the highest Debian version that supports
> IA64, and so I assumed that support had been dropped from Debian.
>
> However, if only a handful of people want to keep this port alive for
> reasons of nostalgia, it is obviously obsolete, and we should ask
> ourselves whether it is reasonable to expect Linux contributors to
> keep spending time on this.
>
> Does the Debian ia64 port have any users? Or is the system that builds
> the packages the only one that consumes them?
>
>
> [3] https://www.debian.org/ports/ia64/

I have no IA64 hardware or be a user of it or have any strong feelings
to keep this arch in the Linux-kernel.

But I am a Debianist (Debian/unstable AMD64 user).

Best is to ask the Debian release-team or (if there exist) maintainers
or responsibles for the IA64 port - which is an ***unofficial*** port.

What I found... on :

https://cdimage.debian.org/cdimage/ > Ports

https://cdimage.debian.org/cdimage/ports/current/

https://cdimage.debian.org/cdimage/ports/current-debian-installer/ia64/debian-installer-images_20211020_ia64.tar.gz
^^ Last modified: 2021-10-20 22:52

https://cdimage.debian.org/cdimage/ports/current/debian-11.0.0-ia64-NETINST-1.iso
^^ Last modofied: 2022-03-28 14:18

With a net-install image you should be able to setup and explore the
IA64 Debian cosmos.

Example #1: binutils packages

Checking available binutils package for Debian/unstable IA64 (version:
2.39.90.20230110-1):

https://packages.debian.org/sid/binutils <--- Clearly states IA64 as
"unofficial port"
https://packages.debian.org/sid/ia64/binutils/filelist

Example #2: linux-image packages

Cannot say what this means...

https://packages.debian.org/search?arch=amd64=linux-image
(AMD64 - matches)

https://packages.debian.org/search?arch=ia64=linux-image
(IA64 - no matches)

https://packages.debian.org/search?arch=ia64=linux (IA64 -
matches - but no linux-image which ships normally a bootable
Linux-kernel)

As stated I have no expertise in Debian whatever release for IA64 arch.

Hope that helps.

-Sedat-


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread Ard Biesheuvel
On Fri, 13 Jan 2023 at 22:06, John Paul Adrian Glaubitz
 wrote:
>
> Hello Ard!
>
> > Can I take that as an ack on [0]? The EFI subsystem has evolved
> > substantially over the years, and there is really no way to do any
> > IA64 testing beyond build testing, so from that perspective, dropping
> > it entirely would be welcomed.
>
> ia64 is regularly tested in Debian and Gentoo [1][2].
>
> Debian's ia64 porterbox yttrium runs a recent kernel without issues:
>
> root@yttrium:~# uname -a
> Linux yttrium 5.19.0-2-mckinley #1 SMP Debian 5.19.11-1 (2022-09-24) ia64 
> GNU/Linux
> root@yttrium:~#
>
> root@yttrium:~# journalctl -b|head -n10
> Nov 14 14:46:10 yttrium kernel: Linux version 5.19.0-2-mckinley 
> (debian-ker...@lists.debian.org) (gcc-11 (Debian 11.3.0-6) 11.3.0, GNU ld 
> (GNU Binutils for Debian) 2.39) #1 SMP Debian 5.19.11-1 (2022-09-24)
> Nov 14 14:46:10 yttrium kernel: efi: EFI v2.10 by HP
> Nov 14 14:46:10 yttrium kernel: efi: SALsystab=0xdfdd63a18 ESI=0xdfdd63f18 
> ACPI 2.0=0x3d3c4014 HCDP=0xd8798 SMBIOS=0x3d368000
> Nov 14 14:46:10 yttrium kernel: PCDP: v3 at 0xd8798
> Nov 14 14:46:10 yttrium kernel: earlycon: uart8250 at I/O port 0x4000 
> (options '115200n8')
> Nov 14 14:46:10 yttrium kernel: printk: bootconsole [uart8250] enabled
> Nov 14 14:46:10 yttrium kernel: ACPI: Early table checksum verification 
> disabled
> Nov 14 14:46:10 yttrium kernel: ACPI: RSDP 0x3D3C4014 24 (v02 HP  
>   )
> Nov 14 14:46:10 yttrium kernel: ACPI: XSDT 0x3D3C4580 000124 (v01 HP  
>RX2800-2 0001  0113)
> Nov 14 14:46:10 yttrium kernel: ACPI: FACP 0x3D3BE000 F4 (v03 HP  
>RX2800-2 0001 HP   0001)
> root@yttrium:~#
>
> Same applies to the buildds:
>
> root@lifshitz:~# uname -a
> Linux lifshitz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 
> GNU/Linux
> root@lifshitz:~#
>
> root@lenz:~# uname -a
> Linux lenz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 GNU/Linux
> root@lenz:~#
>
> EFI works fine as well using the latest version of GRUB2.
>
> Thanks,
> Adrian
>
> > [1] https://cdimage.debian.org/cdimage/ports/snapshots/
> > [2] https://mirror.yandex.ru/gentoo-distfiles//releases/ia64/autobuilds/

Thanks for reporting back. I (mis)read the debian ports page [3],
which mentions Debian 7 as the highest Debian version that supports
IA64, and so I assumed that support had been dropped from Debian.

However, if only a handful of people want to keep this port alive for
reasons of nostalgia, it is obviously obsolete, and we should ask
ourselves whether it is reasonable to expect Linux contributors to
keep spending time on this.

Does the Debian ia64 port have any users? Or is the system that builds
the packages the only one that consumes them?


[3] https://www.debian.org/ports/ia64/


Re: lockref scalability on x86-64 vs cpu_relax

2023-01-13 Thread Mateusz Guzik
On 1/13/23, Linus Torvalds  wrote:
> Side note on your access() changes - if it turns out that you can
> remove all the cred games, we should possibly then revert my old
> commit d7852fbd0f04 ("access: avoid the RCU grace period for the
> temporary subjective credentials") which avoided the biggest issue
> with the unnecessary cred switching.
>
> I *think* access() is the only user of that special 'non_rcu' thing,
> but it is possible that the whole 'non_rcu' thing ends up mattering
> for cases where the cred actually does change because euid != uid (ie
> suid programs), so this would need a bit more effort to do performance
> testing on.
>

I don't think the games are avoidable. For one I found non-root
processes with non-empty cap_effective even on my laptop, albeit I did
not check how often something like this is doing access().

Discussion for another time.

> On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik  wrote:
>> All that said, I think the thing to do here is to replace cpu_relax
>> with a dedicated arch-dependent macro, akin to the following:
>
> I would actually prefer just removing it entirely and see if somebody
> else hollers. You have the numbers to prove it hurts on real hardware,
> and I don't think we have any numbers to the contrary.
>
> So I think it's better to trust the numbers and remove it as a
> failure, than say "let's just remove it on x86-64 and leave everybody
> else with the potentially broken code"
>
[snip]
> Then other architectures can try to run their numbers, and only *if*
> it then turns out that they have a reason to do something else should
> we make this conditional and different on different architectures.
>
> Let's try to keep the code as common as possibly until we have hard
> evidence for special cases, in other words.
>

I did not want to make such a change without redoing the ThunderX2
benchmark, or at least something else arm64-y. I may be able to bench it
tomorrow on whatever arm-y stuff can be found on Amazon's EC2, assuming
no arm64 people show up with their results.

Even then IMHO the safest route is to patch it out on x86-64 and give
other people time to bench their archs as they get around to it, and
ultimately whack the thing if it turns out nobody benefits from it.
I would say beats backpedaling on the removal, but I'm not going to
fight for it.

That said, does waiting for arm64 numbers and/or producing them for the
removal commit message sound like a plan? If so, I'll post soon(tm).

-- 
Mateusz Guzik 


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread John Paul Adrian Glaubitz

Hello Ard!


Can I take that as an ack on [0]? The EFI subsystem has evolved
substantially over the years, and there is really no way to do any
IA64 testing beyond build testing, so from that perspective, dropping
it entirely would be welcomed.


ia64 is regularly tested in Debian and Gentoo [1][2].

Debian's ia64 porterbox yttrium runs a recent kernel without issues:

root@yttrium:~# uname -a
Linux yttrium 5.19.0-2-mckinley #1 SMP Debian 5.19.11-1 (2022-09-24) ia64 
GNU/Linux
root@yttrium:~#

root@yttrium:~# journalctl -b|head -n10
Nov 14 14:46:10 yttrium kernel: Linux version 5.19.0-2-mckinley 
(debian-ker...@lists.debian.org) (gcc-11 (Debian 11.3.0-6) 11.3.0, GNU ld (GNU 
Binutils for Debian) 2.39) #1 SMP Debian 5.19.11-1 (2022-09-24)
Nov 14 14:46:10 yttrium kernel: efi: EFI v2.10 by HP
Nov 14 14:46:10 yttrium kernel: efi: SALsystab=0xdfdd63a18 ESI=0xdfdd63f18 ACPI 
2.0=0x3d3c4014 HCDP=0xd8798 SMBIOS=0x3d368000
Nov 14 14:46:10 yttrium kernel: PCDP: v3 at 0xd8798
Nov 14 14:46:10 yttrium kernel: earlycon: uart8250 at I/O port 0x4000 (options 
'115200n8')
Nov 14 14:46:10 yttrium kernel: printk: bootconsole [uart8250] enabled
Nov 14 14:46:10 yttrium kernel: ACPI: Early table checksum verification disabled
Nov 14 14:46:10 yttrium kernel: ACPI: RSDP 0x3D3C4014 24 (v02 HP
)
Nov 14 14:46:10 yttrium kernel: ACPI: XSDT 0x3D3C4580 000124 (v01 HP
 RX2800-2 0001  0113)
Nov 14 14:46:10 yttrium kernel: ACPI: FACP 0x3D3BE000 F4 (v03 HP
 RX2800-2 0001 HP   0001)
root@yttrium:~#

Same applies to the buildds:

root@lifshitz:~# uname -a
Linux lifshitz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 
GNU/Linux
root@lifshitz:~#

root@lenz:~# uname -a
Linux lenz 6.0.0-4-mckinley #1 SMP Debian 6.0.8-1 (2022-11-11) ia64 GNU/Linux
root@lenz:~#

EFI works fine as well using the latest version of GRUB2.

Thanks,
Adrian


[1] https://cdimage.debian.org/cdimage/ports/snapshots/
[2] https://mirror.yandex.ru/gentoo-distfiles//releases/ia64/autobuilds/


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread Jessica Clarke
On 13 Jan 2023, at 21:03, Luck, Tony  wrote:
> 
>> For what it's worth, Debian and Gentoo both have ia64 ports with active
>> users (6.1 looks like it currently fails to build in Debian due to a
>> minor packaging issue, but various versions of 6.0 were built and
>> published, and one of those is running on the one ia64 Debian builder I
>> personally have access to).
> 
> Jess,
> 
> So dropping ia64 from the upstream kernel won't just save time of kernel
> developers. It will also save time for the folks keeping Debian and Gentoo
> ports up and running.
> 
> Are there people actually running production systems on ia64 that also
> update to v6.x kernels?
> 
> If so, why? Just scrap the machine and replace with almost anything else.
> You'll cover the cost of the new machine in short order with the savings on
> your power bill.

Hobbyists, same as alpha, hppa, m68k, sh and any other such
architectures that have no real use in this day and age.

Jess



RE: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread Luck, Tony
> For what it's worth, Debian and Gentoo both have ia64 ports with active
> users (6.1 looks like it currently fails to build in Debian due to a
> minor packaging issue, but various versions of 6.0 were built and
> published, and one of those is running on the one ia64 Debian builder I
> personally have access to).

Jess,

So dropping ia64 from the upstream kernel won't just save time of kernel
developers. It will also save time for the folks keeping Debian and Gentoo
ports up and running.

Are there people actually running production systems on ia64 that also
update to v6.x kernels?

If so, why? Just scrap the machine and replace with almost anything else.
You'll cover the cost of the new machine in short order with the savings on
your power bill.

-Tony


Re: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread Jessica Clarke
On Fri, Jan 13, 2023 at 08:55:41AM +0100, Ard Biesheuvel wrote:
> On Fri, 13 Jan 2023 at 01:31, Luck, Tony  wrote:
> >
> > > Yeah, if it was ia64-only, it's a non-issue these days. It's dead and
> > > in pure maintenance mode from a kernel perspective (if even that).
> >
> > There's not much "simultaneous" in the SMT on ia64. One thread in a
> > spin loop will hog the core until the h/w switches to the other thread some
> > number of cycles (hundreds, thousands? I really can remember). So I
> > was pretty generous with dropping cpu_relax() into any kind of spin loop.
> >
> > Is it time yet for:
> >
> > $ git rm -r arch/ia64
> >
> 
> Hi Tony,
> 
> Can I take that as an ack on [0]? The EFI subsystem has evolved
> substantially over the years, and there is really no way to do any
> IA64 testing beyond build testing, so from that perspective, dropping
> it entirely would be welcomed.

For what it's worth, Debian and Gentoo both have ia64 ports with active
users (6.1 looks like it currently fails to build in Debian due to a
minor packaging issue, but various versions of 6.0 were built and
published, and one of those is running on the one ia64 Debian builder I
personally have access to).

Jess


RE: ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-13 Thread Luck, Tony
>> Is it time yet for:
>>
>> $ git rm -r arch/ia64
>>

> Can I take that as an ack on [0]? The EFI subsystem has evolved
> substantially over the years, and there is really no way to do any
> IA64 testing beyond build testing, so from that perspective, dropping
> it entirely would be welcomed.
>
> Thanks,
> Ard.
>
>
>
> [0] 
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=remove-ia64

Yes. EFI isn't the only issue. A bunch of folks[1] have spent time fixing ia64
for (in most cases) some tree-wide patches that they needed. Their time might
have been more productively spent fixing things that actually matter in modern 
times.

Acked-by: Tony Luck 

-Tony

[1] git log --no-merges --since=2year -- arch/ia64 | grep Author: | sort | uniq 
-c | sort -rn
 19 Author: Masahiro Yamada 
 11 Author: Sergei Trofimovich 
  9 Author: Eric W. Biederman 
  8 Author: Arnd Bergmann 
  6 Author: Randy Dunlap 
  6 Author: Kefeng Wang 
  6 Author: Anshuman Khandual 
  5 Author: Masami Hiramatsu 
  5 Author: Al Viro 
  4 Author: Peter Zijlstra 
  4 Author: Mike Rapoport 
  4 Author: David Hildenbrand 
  4 Author: Christophe Leroy 
  3 Author: Yury Norov 
  3 Author: Michal Hocko 
  3 Author: Geert Uytterhoeven 
  3 Author: Bhaskar Chowdhury 
  3 Author: Baoquan He 
  3 Author: Ard Biesheuvel 
  3 Author: Aneesh Kumar K.V 
  2 Author: Yang Guang 
  2 Author: Will Deacon 
  2 Author: Viresh Kumar 
  2 Author: Valentin Schneider 
  2 Author: Stafford Horne 
  2 Author: Sebastian Andrzej Siewior 
  2 Author: Richard Guy Briggs 
  2 Author: Peter Xu 
  2 Author: Peter Collingbourne 
  2 Author: Mark Rutland 
  2 Author: Lukas Bulwahn 
  2 Author: Julia Lawall 
  2 Author: Jens Axboe 
  2 Author: Jason Wang 
  2 Author: Jan Kara 
  2 Author: Christoph Hellwig 
  2 Author: Bjorn Helgaas 
  2 Author: Alexander Lobakin 
  1 Author: Zi Yan 
  1 Author: Zhang Yunkai 
  1 Author: ye xingchen 
  1 Author: xu xin 
  1 Author: Wolfram Sang 
  1 Author: Weizhao Ouyang 
  1 Author: Suren Baghdasaryan 
  1 Author: Souptick Joarder (HPE) 
  1 Author: Sergey Shtylyov 
  1 Author: Sergei Trofimovich 
  1 Author: Sascha Hauer 
  1 Author: Samuel Holland 
  1 Author: Qi Zheng 
  1 Author: Peng Liu 
  1 Author: Naveen N. Rao 
  1 Author: Muchun Song 
  1 Author: Mikulas Patocka 
  1 Author: Mike Kravetz 
  1 Author: Mickaël Salaün 
  1 Author: Matthew Wilcox (Oracle) 
  1 Author: Martin Oliveira 
  1 Author: Luc Van Oostenryck 
  1 Author: Kees Cook 
  1 Author: Jason A. Donenfeld 
  1 Author: Ilpo Järvinen 
  1 Author: Haowen Bai 
  1 Author: Gustavo A. R. Silva 
  1 Author: Greg Kroah-Hartman 
  1 Author: Gaosheng Cui 
  1 Author: Dmitry Osipenko 
  1 Author: Dawei Li 
  1 Author: Chuck Lever 
  1 Author: Christian Brauner 
  1 Author: Chris Down 
  1 Author: Chen Li 
  1 Author: Catalin Marinas 
  1 Author: Benjamin Stürz 
  1 Author: Ben Dooks 
  1 Author: Baolin Wang 
  1 Author: Andy Shevchenko 
  1 Author: André Almeida 


Re: lockref scalability on x86-64 vs cpu_relax

2023-01-13 Thread Peter Zijlstra
On Thu, Jan 12, 2023 at 06:13:16PM -0600, Linus Torvalds wrote:
> On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik  wrote:
> >
> > To my understanding on said architecture failed cmpxchg still grants you
> > exclusive access to the cacheline, making immediate retry preferable
> > when trying to inc/dec unless a certain value is found.
> 
> I actually suspect that is _always_ the case - this is not like a
> contended spinlock where we want to pause because we're waiting for
> the value to change and become unlocked, this cmpxchg loop is likely
> always better off just retrying with the new value.
> 
> That said, the "likely always better off" is purely about performance.
> 
> So I have this suspicion that the reason Tony added the cpu_relax()
> was simply not about performance, but about other issues, like
> fairness in SMT situations.
> 
> That said, evern from a fairness perspective the cpu_relax() sounds a
> bit odd and unlikely - we're literally yielding when we lost a race,
> so it hurts the _loser_, not the winner, and thus might make fairness
> worse too.

I've been writing cmpxchg loops that have strict termination conditions
without cpu_relax() in them for a while now.

For example, the x86 atomic_fetch_and() implementation looks like so:

static __always_inline int arch_atomic_fetch_and(int i, atomic_t *v)
{
int val = arch_atomic_read(v);

do { } while (!arch_atomic_try_cmpxchg(v, , val & i));

return val;
}

And I did that because of the exact same argument you had above, it
needs to do the op anyway, waiting between failed attempts will only
increase the chance it will fail again.


Re: lockref scalability on x86-64 vs cpu_relax

2023-01-13 Thread Will Deacon
On Fri, Jan 13, 2023 at 02:12:50AM +0100, Mateusz Guzik wrote:
> On 1/13/23, Linus Torvalds  wrote:
> > Side note on your access() changes - if it turns out that you can
> > remove all the cred games, we should possibly then revert my old
> > commit d7852fbd0f04 ("access: avoid the RCU grace period for the
> > temporary subjective credentials") which avoided the biggest issue
> > with the unnecessary cred switching.
> >
> > I *think* access() is the only user of that special 'non_rcu' thing,
> > but it is possible that the whole 'non_rcu' thing ends up mattering
> > for cases where the cred actually does change because euid != uid (ie
> > suid programs), so this would need a bit more effort to do performance
> > testing on.
> >
> 
> I don't think the games are avoidable. For one I found non-root
> processes with non-empty cap_effective even on my laptop, albeit I did
> not check how often something like this is doing access().
> 
> Discussion for another time.
> 
> > On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik  wrote:
> >> All that said, I think the thing to do here is to replace cpu_relax
> >> with a dedicated arch-dependent macro, akin to the following:
> >
> > I would actually prefer just removing it entirely and see if somebody
> > else hollers. You have the numbers to prove it hurts on real hardware,
> > and I don't think we have any numbers to the contrary.
> >
> > So I think it's better to trust the numbers and remove it as a
> > failure, than say "let's just remove it on x86-64 and leave everybody
> > else with the potentially broken code"
> >
> [snip]
> > Then other architectures can try to run their numbers, and only *if*
> > it then turns out that they have a reason to do something else should
> > we make this conditional and different on different architectures.
> >
> > Let's try to keep the code as common as possibly until we have hard
> > evidence for special cases, in other words.
> >
> 
> I did not want to make such a change without redoing the ThunderX2
> benchmark, or at least something else arm64-y. I may be able to bench it
> tomorrow on whatever arm-y stuff can be found on Amazon's EC2, assuming
> no arm64 people show up with their results.
> 
> Even then IMHO the safest route is to patch it out on x86-64 and give
> other people time to bench their archs as they get around to it, and
> ultimately whack the thing if it turns out nobody benefits from it.
> I would say beats backpedaling on the removal, but I'm not going to
> fight for it.
> 
> That said, does waiting for arm64 numbers and/or producing them for the
> removal commit message sound like a plan? If so, I'll post soon(tm).

Honestly, I wouldn't worry about us (arm64) here. I don't think any real
hardware implements the YIELD instruction (i.e. it behaves as a NOP in
practice). The only place I'm aware of where it _does_ something is in
QEMU, which was actually the motivation behind having it in cpu_relax() to
start with (see 1baa82f48030 ("arm64: Implement cpu_relax as yield")).

So, from the arm64 side of the fence, I'm perfectly happy just removing
the cpu_relax() calls from lockref.

Will


ia64 removal (was: Re: lockref scalability on x86-64 vs cpu_relax)

2023-01-12 Thread Ard Biesheuvel
On Fri, 13 Jan 2023 at 01:31, Luck, Tony  wrote:
>
> > Yeah, if it was ia64-only, it's a non-issue these days. It's dead and
> > in pure maintenance mode from a kernel perspective (if even that).
>
> There's not much "simultaneous" in the SMT on ia64. One thread in a
> spin loop will hog the core until the h/w switches to the other thread some
> number of cycles (hundreds, thousands? I really can remember). So I
> was pretty generous with dropping cpu_relax() into any kind of spin loop.
>
> Is it time yet for:
>
> $ git rm -r arch/ia64
>

Hi Tony,

Can I take that as an ack on [0]? The EFI subsystem has evolved
substantially over the years, and there is really no way to do any
IA64 testing beyond build testing, so from that perspective, dropping
it entirely would be welcomed.

Thanks,
Ard.



[0] 
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=remove-ia64


Re: lockref scalability on x86-64 vs cpu_relax

2023-01-12 Thread Nicholas Piggin
On Fri Jan 13, 2023 at 2:15 PM AEST, Linus Torvalds wrote:
> On Thu, Jan 12, 2023 at 9:20 PM Nicholas Piggin  wrote:
> >
> > Actually what we'd really want is an arch specific implementation of
> > lockref.
>
> The problem is mainly that then you need to generate the asm versions
> of all those different CMPXCHG_LOOP()  variants.
>
> They are all fairly simple, though, and it woudln't be hard to make
> the current lib/lockref.c just be the generic fallback if you don't
> have an arch-specific one.

Yeah, it doesn't look too onerous so it's probably worth seeing what
the code and some numbers look like here.

> And even if you do have the arch-specific LL/SC version, you'd still
> want the generic fallback for the case where a spinlock isn't a single
> word any more (which happens when the spinlock debugging options are
> on).

You're right, good point.

Thanks,
Nick


Re: lockref scalability on x86-64 vs cpu_relax

2023-01-12 Thread Linus Torvalds
On Thu, Jan 12, 2023 at 9:20 PM Nicholas Piggin  wrote:
>
> Actually what we'd really want is an arch specific implementation of
> lockref.

The problem is mainly that then you need to generate the asm versions
of all those different CMPXCHG_LOOP()  variants.

They are all fairly simple, though, and it woudln't be hard to make
the current lib/lockref.c just be the generic fallback if you don't
have an arch-specific one.

And even if you do have the arch-specific LL/SC version, you'd still
want the generic fallback for the case where a spinlock isn't a single
word any more (which happens when the spinlock debugging options are
on).

Linus


Re: lockref scalability on x86-64 vs cpu_relax

2023-01-12 Thread Linus Torvalds
On Thu, Jan 12, 2023 at 7:12 PM Mateusz Guzik  wrote:
>
> I did not want to make such a change without redoing the ThunderX2
> benchmark, or at least something else arm64-y. I may be able to bench it
> tomorrow on whatever arm-y stuff can be found on Amazon's EC2, assuming
> no arm64 people show up with their results.

I don't think ThunderX2 itself is particularly interesting, but sure,
it would be good to have numbers for some modern arm64 cores.

The newer Amazon EC2 cores (Graviton 2/3) sound more relevant (or
Ampere?)  The more different architecture numbers we'd have for that
"remove cpu_relax()", the better.

   Linus


Re: lockref scalability on x86-64 vs cpu_relax

2023-01-12 Thread Nicholas Piggin
On Fri Jan 13, 2023 at 10:13 AM AEST, Linus Torvalds wrote:
> [ Adding linux-arch, which is relevant but not very specific, and the
> arm64 and powerpc maintainers that are the more specific cases for an
> architecture where this might actually matter.
>
>   See
>
> 
> https://lore.kernel.org/all/CAGudoHHx0Nqg6DE70zAVA75eV-HXfWyhVMWZ-aSeOofkA_=w...@mail.gmail.com/
>
>   for original full email, but it might be sufficiently clear just
> from this heavily cut-down context too ]
>
> Side note on your access() changes - if it turns out that you can
> remove all the cred games, we should possibly then revert my old
> commit d7852fbd0f04 ("access: avoid the RCU grace period for the
> temporary subjective credentials") which avoided the biggest issue
> with the unnecessary cred switching.
>
> I *think* access() is the only user of that special 'non_rcu' thing,
> but it is possible that the whole 'non_rcu' thing ends up mattering
> for cases where the cred actually does change because euid != uid (ie
> suid programs), so this would need a bit more effort to do performance
> testing on.
>
> On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik  wrote:
> >
> > To my understanding on said architecture failed cmpxchg still grants you
> > exclusive access to the cacheline, making immediate retry preferable
> > when trying to inc/dec unless a certain value is found.
>
> I actually suspect that is _always_ the case - this is not like a
> contended spinlock where we want to pause because we're waiting for
> the value to change and become unlocked, this cmpxchg loop is likely
> always better off just retrying with the new value.

Yes this should be true for powerpc (POWER CPUs, at least).

> That said, the "likely always better off" is purely about performance.
>
> So I have this suspicion that the reason Tony added the cpu_relax()
> was simply not about performance, but about other issues, like
> fairness in SMT situations.
>
> That said, evern from a fairness perspective the cpu_relax() sounds a
> bit odd and unlikely - we're literally yielding when we lost a race,
> so it hurts the _loser_, not the winner, and thus might make fairness
> worse too.

Worse is that we've also actually just *won* a race when the cmpxchg
comes back, i.e., to get the line exclusive in our cache line. Then
we'll just sit there waiting and probably holding off other snoopers
for a while.

I don't see much of a fairness concern really. If there's a lot of
contention here we'll be stalled for a long time waiting on the line,
so SMT heuristics had better send resources to other threads making
better progress anyway as it should for any cache miss situation. So
this loop shouldn't be hogging up a lot of resources from the other
thread(s).

> I dunno.  Tony may have some memory of what the issue was.
>
> > ... without numbers attached to it. Given the above linked thread it
> > looks like the arch this was targeting was itanium, not x86-64, but
> > the change landed for everyone.
>
> Yeah, if it was ia64-only, it's a non-issue these days. It's dead and
> in pure maintenance mode from a kernel perspective (if even that).
>
> > Later it was further augmented with:
> > commit 893a7d32e8e04ca4d6c882336b26ed660ca0a48d
> > Author: Jan Glauber 
> > Date:   Wed Jun 5 15:48:49 2019 +0200
> >
> > lockref: Limit number of cmpxchg loop retries
> > [snip]
> > With the retry limit the performance of an open-close testcase
> > improved between 60-70% on ThunderX2.
> >
> > While the benchmark was specifically on ThunderX2, the change once more
> > was made for all archs.
>
> Actually, in that case I did ask for the test to be run on x86
> hardware too, and exactly like you found:
>
> > I should note in my tests the retry limit was never reached fwiw.
>
> the max loop retry number just isn't an issue. It fundamentally only
> affects extremely unfair platforms, so it's arguably always the right
> thing to do.
>
> So it may be "ThunderX2 specific" in that that is where it was
> noticed, but I think we can safely just consider the max loop thing to
> be a generic safety net that hopefully simply never triggers in
> practice on any sane platform.
>

If there are a lot of threads contending I'm sure x86, POWER, probably
most CPUs could quite possibly starve for hundreds if not thousands or
more of iterations here.

And I'm not really a fan of scattering random implementation specific
crutches ad hoc throughout our primitives. At least it could be specific
to the arch where it matters.

Interesting that improves performance so much though. I wonder why?
Hitting the limit will take the lock and that will cause all other CPUs
to drop out of the "fast" path so it will degenerate to a spinlock.
queued spinlock is pretty scalable but it really shouldn't be more
scalable than an atomic OP. I bet this cpu_relax isn't helping, and
probably ll/sc implementation of cmpxchg primitive doesn't help either.

I reckon if you removed the cpu_relax there, big x86 systems 

Re: lockref scalability on x86-64 vs cpu_relax

2023-01-12 Thread Linus Torvalds
On Thu, Jan 12, 2023 at 6:31 PM Luck, Tony  wrote:
>
> There's not much "simultaneous" in the SMT on ia64.

Oh, I forgot about the whole SoEMT fiasco.

Yeah, that might make ia64 act a bit differently here.

But I don't think anybody cares any more, so I don't think that merits
making this a per-architecture choice.

The s390 people hated cpu_relax() here, but for them it was really
because it was bad *everywhere*, and they just made it a no-op (see
commit 22b6430d3665 "locking/core, s390: Make cpu_relax() a barrier
again"). There had been a (failed) attempt at "cpu_relax_lowlatency()"
for the s390 issues.

  Linus


RE: lockref scalability on x86-64 vs cpu_relax

2023-01-12 Thread Luck, Tony
> Yeah, if it was ia64-only, it's a non-issue these days. It's dead and
> in pure maintenance mode from a kernel perspective (if even that).

There's not much "simultaneous" in the SMT on ia64. One thread in a
spin loop will hog the core until the h/w switches to the other thread some
number of cycles (hundreds, thousands? I really can remember). So I
was pretty generous with dropping cpu_relax() into any kind of spin loop.

Is it time yet for:

$ git rm -r arch/ia64

-Tony


Re: lockref scalability on x86-64 vs cpu_relax

2023-01-12 Thread Linus Torvalds
[ Adding linux-arch, which is relevant but not very specific, and the
arm64 and powerpc maintainers that are the more specific cases for an
architecture where this might actually matter.

  See


https://lore.kernel.org/all/CAGudoHHx0Nqg6DE70zAVA75eV-HXfWyhVMWZ-aSeOofkA_=w...@mail.gmail.com/

  for original full email, but it might be sufficiently clear just
from this heavily cut-down context too ]

Side note on your access() changes - if it turns out that you can
remove all the cred games, we should possibly then revert my old
commit d7852fbd0f04 ("access: avoid the RCU grace period for the
temporary subjective credentials") which avoided the biggest issue
with the unnecessary cred switching.

I *think* access() is the only user of that special 'non_rcu' thing,
but it is possible that the whole 'non_rcu' thing ends up mattering
for cases where the cred actually does change because euid != uid (ie
suid programs), so this would need a bit more effort to do performance
testing on.

On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik  wrote:
>
> To my understanding on said architecture failed cmpxchg still grants you
> exclusive access to the cacheline, making immediate retry preferable
> when trying to inc/dec unless a certain value is found.

I actually suspect that is _always_ the case - this is not like a
contended spinlock where we want to pause because we're waiting for
the value to change and become unlocked, this cmpxchg loop is likely
always better off just retrying with the new value.

That said, the "likely always better off" is purely about performance.

So I have this suspicion that the reason Tony added the cpu_relax()
was simply not about performance, but about other issues, like
fairness in SMT situations.

That said, evern from a fairness perspective the cpu_relax() sounds a
bit odd and unlikely - we're literally yielding when we lost a race,
so it hurts the _loser_, not the winner, and thus might make fairness
worse too.

I dunno.  Tony may have some memory of what the issue was.

> ... without numbers attached to it. Given the above linked thread it
> looks like the arch this was targeting was itanium, not x86-64, but
> the change landed for everyone.

Yeah, if it was ia64-only, it's a non-issue these days. It's dead and
in pure maintenance mode from a kernel perspective (if even that).

> Later it was further augmented with:
> commit 893a7d32e8e04ca4d6c882336b26ed660ca0a48d
> Author: Jan Glauber 
> Date:   Wed Jun 5 15:48:49 2019 +0200
>
> lockref: Limit number of cmpxchg loop retries
> [snip]
> With the retry limit the performance of an open-close testcase
> improved between 60-70% on ThunderX2.
>
> While the benchmark was specifically on ThunderX2, the change once more
> was made for all archs.

Actually, in that case I did ask for the test to be run on x86
hardware too, and exactly like you found:

> I should note in my tests the retry limit was never reached fwiw.

the max loop retry number just isn't an issue. It fundamentally only
affects extremely unfair platforms, so it's arguably always the right
thing to do.

So it may be "ThunderX2 specific" in that that is where it was
noticed, but I think we can safely just consider the max loop thing to
be a generic safety net that hopefully simply never triggers in
practice on any sane platform.

> All that said, I think the thing to do here is to replace cpu_relax
> with a dedicated arch-dependent macro, akin to the following:

I would actually prefer just removing it entirely and see if somebody
else hollers. You have the numbers to prove it hurts on real hardware,
and I don't think we have any numbers to the contrary.

So I think it's better to trust the numbers and remove it as a
failure, than say "let's just remove it on x86-64 and leave everybody
else with the potentially broken code"

Because I do think that a cmpxchg loop that updates the value it
compares and exchanges is fundamentally different from a "busy-loop,
trying to read while locked", and with your numbers as ammunition, I
think it's better to just remove that cpu_relax() entirely.

Then other architectures can try to run their numbers, and only *if*
it then turns out that they have a reason to do something else should
we make this conditional and different on different architectures.

Let's try to keep the code as common as possibly until we have hard
evidence for special cases, in other words.

 Linus