> Any chance you could upgrade to either the Debian kernel or one based
> on the Debian kernel or on 4.19 mainline and see if any of them fix it?
>
> Also, I suggest you try to get those quirks fixed in Linux mainline,
> so that you don't have to keep building Linux yourself :)

I'm afraid upgrading would be difficult. The use-case is one of fairly
entrenched networking for my company, so there's significant
regression-testing required to even move further within the same minor
version.

The quirks are also somewhat difficult to generalise: some of the
hardware is rather antiquated, requiring driver hacks for
initialisation, and many of the other changes are similarly niche and
intended only as case-specific optimisations. (It's divergent in much
the same way that many of the OpenWRT project's kernel patches will
never be suitable for mainline, but we do submit bugfixes and stuff
when we can.)

I was able to confirm that the problem doesn't occur with Debian's
4.19 series on unrelated hardware, however, and it looks like the
problem has been resolved for a while in stock kernels.


> Do you have any details about which patch this is?

https://lists.ubuntu.com/archives/kernel-team/2018-May/092723.html

I wasn't able to find an equivalent in Debian's patchsets, which led
me to check upstream; more on that below.


> Also, it would be great if you could try to get the patch into the
> Linux kernel's mainline stable releases.

After digging around for a while, it looks like this may be a
side-effect of how LTSI works, though I'll report it anyway, in hopes
that it can be addressed without violating the "don't break the
userspace ABI" policy. Any semi-recent kernel version should be
unaffected.

The following patch adds a "NoNewPrivs" line to /proc/<pid>/status,
at the point where the blank line appears in 4.4. It doesn't look like
it was ever backported into that tree; rather, it seems that the
retpoline logic, which assumed there was text right after the
capabilities block, was backported directly, leaving the gap.

NoNewPrivs patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=af884cd4a5ae62fcf5e321fecf0ec1014730353d
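
To check whether a given kernel shows the gap, something like the
following minimal Python sketch should be enough. The function name
find_blank_lines is made up for illustration, and the sketch only
assumes that the symptom is an empty line somewhere in the status text.

```python
def find_blank_lines(status_text):
    """Return (line_number, previous_line) for each empty line in the text.

    line_number is 1-based; previous_line is None if the very first
    line is the empty one.
    """
    gaps = []
    previous = None
    for number, line in enumerate(status_text.splitlines(), start=1):
        if not line.strip():
            # An empty line where a "Key:<tab>value" entry was expected.
            gaps.append((number, previous))
        previous = line
    return gaps
```

On a Linux box it can be fed the contents of /proc/self/status (read
with open() as usual); an empty result means that kernel's status
output has no gap.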


In any case, this is definitely an issue that should be fixed in
(specific versions of) the kernel. I still think iotop should be
synced to gain tolerance to unexpected input, but it isn't a
Debian-specific problem in light of what's been discussed here.
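
As a rough illustration of what "tolerance to unexpected input" could
mean for a /proc/<pid>/status reader, here is a minimal Python sketch.
It is not iotop's actual code; parse_status is a hypothetical helper,
and the only assumption is that valid entries have a "Key: value"
shape while anything else (such as the blank line) can be ignored.

```python
def parse_status(status_text):
    """Parse /proc/<pid>/status text into a dict, skipping malformed lines."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            # Blank line, or anything else without a colon: skip it
            # rather than raising while unpacking the split result.
            continue
        fields[key.strip()] = value.strip()
    return fields
```

Using partition() instead of unpacking the result of split() means a
line with no colon yields an empty separator rather than an exception,
so one stray line doesn't take down the whole parse.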


Thanks for the sanity-checks here.

-Neil
