Hello,
since there has been a lot of recent activity around the Linux kernel on SPARC
and there are also a lot of issues to be dealt with and unmerged patches, I have
decided to summarize the current state of the Linux kernel on SPARC to bring
anyone interested up to date.
First, let's start with the bugs. It has been known for a while that recent
kernel versions can be very unstable on certain SPARC machines. This has been
observed in particular with UltraSPARC III CPUs, but also on certain newer
CPUs such as SPARC T1.
After I started bisecting the issue, I ran into multiple false positives until
I identified d53d2f78cead as the culprit. That commit makes use of a new
vmalloc flag called VM_FLUSH_RESET_PERMS.
However, this particular change is not actually broken; it merely uncovered
the original bug. The introduction of VM_FLUSH_RESET_PERMS allowed the kernel
to flush TLBs earlier after booting and more often. Since the original problem
was suspected to lie in the TLB flush management on SPARC, it was only natural
that the change in d53d2f78cead turned up in the bisect.
Further investigation showed that the actual culprits are the CPU-specific
implementations of copy_{from,to}_user, which can be found in arch/sparc/lib.
These are broken to different degrees on different CPU types, which explains
why recent kernels are more unstable on certain CPU types than on others.
Luckily, Michael Karcher has already made good progress in investigating and
fixing these bugs. For example, a trial patch for the UltraSPARC III showed
that a simple one-line change would fix all the stability issues currently
seen on these CPU types.
It is expected that a series of patches will follow shortly that will address
the bugs in copy_{from,to}_user on all affected CPU types. In the meantime, it
should be possible to switch the kernel to the generic code for
copy_{from,to}_user, which can be found in the same source directory, to get a
stable system on any CPU type.
Another issue that was discovered is that support for HugeTLB was broken on
sun4u. A patch addressing the problem has been posted by Anthony Yznaga in
[1]. Additional pending patches fix the error handling in scan_one_device()
[2] and switch sparc64 to the generic vDSO library [3].
Once the stability issues have been fixed, the focus should be on upstreaming
feature patches that Oracle engineers developed but never sent in for review.
These can be found in Oracle's GitHub repository for the UEK kernel in the
uek4/qu7 branch [4].
These feature patches include useful additions such as support for kexec [5],
5-level page table support [6], EFI support for newer servers [7], support for
SPARC M8 [8], fixes for SPARC M7 [9], support for running the Linux kernel as
a primary LDOM [10], and many other improvements.
So, there is definitely a lot of work to be done on Linux for SPARC, and we're
going to be busy for some more years. Hopefully, some folks from Oracle can
step in and help upstream some of the patches from Oracle's UEK kernel.
Primary domain support in particular would be very nice to have on Linux, as
this would allow creating logical domains on sun4v hardware without having to
install Solaris.
Cheers,
Adrian
> [1] https://lore.kernel.org/all/[email protected]/
> [2] https://lore.kernel.org/all/[email protected]/
> [3] https://lore.kernel.org/all/[email protected]/
> [4] https://github.com/oracle/linux-uek/tree/uek4/qu7/
> [5] https://github.com/oracle/linux-uek/commit/6fa4477f7e671b4882517a0862d3ee3f65ff4bde
>     (there are multiple patches for kexec)
> [6] https://github.com/oracle/linux-uek/commit/9783abbe2d19da0d36a2b1caa4b15d965ee68384
> [7] https://github.com/oracle/linux-uek/commit/127ca6582a90567ded4fa6168c1582d2d5ac37f0
> [8] https://github.com/oracle/linux-uek/commit/5fe100ac31a6f977ebb64ce4eea7b0e3de7dbe04
> [9] https://github.com/oracle/linux-uek/commit/efcafbab1b123d615c1f2683c98fccc5ccee1527
> [10] https://github.com/oracle/linux-uek/commit/6c87154b63230bc5e35c5df133e7ecfadf47b828
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913