On Tue, Oct 17, 2023 at 12:21 AM Marc Zyngier <m...@kernel.org> wrote:
>
> On Mon, 16 Oct 2023 02:42:08 +0100,
> Chris Packham <judge.pack...@gmail.com> wrote:
> >
> > On Sun, Oct 15, 2023 at 10:29 AM Chris Packham <judge.pack...@gmail.com>
> > wrote:
> > >
> > > On Sat, 14 Oct 2023, 11:04 am Marc Zyngier, <m...@kernel.org> wrote:
> > >>
> > >> On 2023-10-13 03:40, Chris Packham wrote:
> > >> > Hi Marc, Paul,
> > >> >
> > >> > On Sat, Mar 18, 2023 at 5:23 AM Ying-Chun Liu (PaulLiu)
> > >> > <paul....@linaro.org> wrote:
> > >> >>
> > >> >> From: Marc Zyngier <m...@kernel.org>
> > >> >>
> > >> >> Some recent arm64 cores have a facility that allows the page
> > >> >> table walker to track the dirty state of a page. This makes it
> > >> >> really efficient to perform CMOs by VA as we only need to look
> > >> >> at dirty pages.
> > >> >>
> > >> >> Signed-off-by: Marc Zyngier <m...@kernel.org>
> > >> >> [ Paul: pick from the Android tree. Rebase to the upstream ]
> > >> >> Signed-off-by: Ying-Chun Liu (PaulLiu) <paul....@linaro.org>
> > >> >> Cc: Tom Rini <tr...@konsulko.com>
> > >> >> Link:
> > >> >> https://android.googlesource.com/platform/external/u-boot/+/3c433724e6f830a6b2edd5ec3d4a504794887263
> > >> >
> > >> > I think this may have caused a regression for the Marvell AC5X
> > >> > board(s). I found that v2023.07 locked up at boot but v2023.01 was
> > >> > fine. The lockup seemed to be in the 'Net:' init probably just as the
> > >> > mvneta driver was being initialised.
> > >> >
> > >> > A git bisect led me to this change although for this specific change
> > >> > instead of the lockup I get a crash so maybe I'm actually hitting a
> > >> > different issue.
> > >> >
> > >> > Any thoughts as to why this may have caused problems?
> > >>
> > >> Not really. What CPUs does this platform have? What is the offending
> > >> driver doing to trigger the issue? Can you provide some level of
> > >> tracing?
> > >
> > > The Marvell AC5X is a network switch ASIC with an integrated ARMv8 CPU
> > > (8.1 specifically I think).
> > >
> > > I think there is something that the mvneta driver is doing triggering the
> > > issue. I have another AC5X based board without an Ethernet port that
> > > boots just fine (this is also why I didn't notice earlier).
> > >
> > > I'll try and get some more debug out when I'm back in the office
> > >
> >
> > The thing the mvneta driver does that upsets things appears to be
> >
> > mmu_set_region_dcache_behaviour((phys_addr_t)bd_space, BD_SPACE,
> > DCACHE_OFF);
> >
> > I can comment that line out and everything works.
>
> This leads to two questions:
>
> - is the device cache coherent, in which case it doesn't need the
> memory being non-cacheable? If everything is OK, then why the switch
> to device memory?

I'll be honest and say I understand less than 50% of that. The network
transfer does seem to work without the call, so perhaps the device is
cache coherent. But this call is common in many drivers, so I'd assume
it should be innocuous on such platforms.

It's totally possible I haven't done a good job of setting up the CPU
or informing the rest of the system about it. I took a lot of the code
from the Marvell SDK and cleaned it up without really understanding
what most of it did.

>
> - what goes wrong when these attributes are applied? do we have to
> split a block mapping?
>
> Instrumenting the MMU code would certainly help understanding what
> goes wrong here.

I did do that a little bit. At first I thought there was a possible
infinite loop in mmu_set_region_dcache_behaviour(). Squinting at
things, you could naively say that if set_one_region() failed to find
an entry then it would loop forever, but if that happened I'd have some
debug output saying that it failed.

Things seem to go south after __asm_switch_ttbr(gd->arch.tlb_emerg),
which got me thinking that perhaps the emergency tables aren't set up
(or at least aren't set up in a way that allows debug output).

That's about as far as I got debugging-wise. I'll try and spend some
more time digging into the MMU code.

>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
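
For reference, the "loop forever" concern raised above (a region loop
that only advances when set_one_region() finds an entry) can be
modelled outside U-Boot. The sketch below is a standalone model under
stated assumptions, not the U-Boot source: model_set_one_region(),
model_set_region() and the deepest_level parameter are hypothetical
names invented for illustration, and the level sizes assume a 4K
granule.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes covered by one entry at levels 1..3, assuming a 4K granule
 * (1G, 2M, 4K). Index 0 is unused. */
static const uint64_t model_level_size[4] = {
    0, 1ULL << 30, 1ULL << 21, 1ULL << 12
};

/*
 * Hypothetical stand-in for set_one_region(): returns the number of
 * bytes handled at this level, or 0 if no entry at this level fits.
 * deepest_level models how far down the tables are actually populated.
 */
static uint64_t model_set_one_region(uint64_t start, uint64_t size,
                                     int level, int deepest_level)
{
    uint64_t granule;

    assert(level >= 1 && level <= 3);
    granule = model_level_size[level];

    if (level > deepest_level)
        return 0; /* models a walk that finds no entry */
    if (size >= granule && (start & (granule - 1)) == 0)
        return granule;
    return 0;
}

/*
 * Walk the range, trying each level in turn. Returns 0 on success and
 * -1 if a full pass over all levels made no progress.
 */
static int model_set_region(uint64_t start, uint64_t size,
                            int deepest_level)
{
    while (size > 0) {
        uint64_t r = 0;
        int level;

        for (level = 1; level < 4; level++) {
            r = model_set_one_region(start, size, level, deepest_level);
            if (r) {
                size -= r;
                start += r;
                break;
            }
        }
        if (!r)
            return -1; /* without this guard the loop never ends */
    }
    return 0;
}
```

In this model, model_set_region(0x40001000, 0x3000, 3) completes by
handling three 4K entries, while the same range with deepest_level set
to 2 finds nothing at any level and returns -1. A similar no-progress
check (or a debug print at that point) in instrumented code would
separate "stuck in the loop" from "died right after switching to the
emergency tables".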