On 04/02/2021 10:27, Ard Biesheuvel wrote:
> On Thu, 4 Feb 2021 at 11:06, Russell King - ARM Linux admin
> <li...@armlinux.org.uk> wrote:
>>
>> On Thu, Feb 04, 2021 at 10:07:58AM +0100, Ard Biesheuvel wrote:
>>> On Thu, 4 Feb 2021 at 09:43, Guillaume Tucker
>>> <guillaume.tuc...@collabora.com> wrote:
>>>>
>>>> Hi Ard,
>>>>
>>>> Please see the bisection report below about a boot failure on
>>>> rk3288 with next-20210203. It was also bisected on
>>>> imx6q-var-dt6customboard with next-20210202.
>>>>
>>>> Reports aren't automatically sent to the public while we're
>>>> trialing new bisection features on kernelci.org but this one
>>>> looks valid.
>>>>
>>>> The kernel is most likely crashing very early on, so there's
>>>> nothing in the logs. Please let us know if you need some help
>>>> with debugging or trying a fix on these platforms.
>>>>
>>>
>>> Thanks for the report.
>>
>> Ard,
>>
>> I want to send my fixes branch today which includes your regression
>> fix that caused this regression.
>>
>> As this is proving difficult to fix, I can only drop your fix from
>> my fixes branch - and given that this seems to be problematical, I'm
>> tempted to revert the original change at this point which should fix
>> both of these regressions - and then we have another go at getting rid
>> of the set/way instructions during the next cycle.
>>
>> Thoughts?
>>
>
> Hi Russell,
>
> If Guillaume is willing to do the experiment, and it fixes the issue,

Yes, I'm running some tests with that fix now and should have some
results shortly.

> it proves that rk3288 is relying on the flush before the MMU is
> disabled, and so in that case, the fix is trivial, and we can just
> apply it.
>
> If the experiment fails (which would mean rk3288 does not tolerate the
> cache maintenance being performed after cache off), it is going to be
> hairy, and so it will definitely take more time.
>
> So in the latter case (or if Guillaume does not get back to us), I
> think reverting my queued fix is the only sane option. But in that
> case, may I suggest that we queue the revert of the original by-VA
> change for v5.12 so it gets lots of coverage in -next, and allows us
> an opportunity to come up with a proper fix in the same timeframe, and
> backport the revert and the subsequent fix as a pair? Otherwise, we'll
> end up in the situation where v5.10.x until today has by-va, v5.10.x-y
> has set/way, and v5.10y+ has by-va again. (I don't think we care about
> anything before that, given that v5.4 predates any of this)
>
> But in the end, I'm happy to go along with whatever works best for you.

Thanks,
Guillaume