On 19/10/21 8:59 am, Joel Sherrill wrote:
> On Mon, Oct 18, 2021 at 4:28 PM Chris Johns <chr...@rtems.org> wrote:
>>
>> On 19/10/21 3:53 am, Kinsey Moore wrote:
>>> On 10/18/2021 00:44, Chris Johns wrote:
>>>> Hi,
>>>>
>>>> I cannot run libbsd on real hardware because the cadence rx descriptor
>>>> cache coherent allocation crashes in `memset`. It is used to clear the
>>>> memory.
>>>>
>>>> The rtemsbsd allocator call optionally clears the memory and it seems
>>>> the newlib aarch64 memset code crashes when doing this. A basic loop
>>>> with 8bit or 32bit writes does not crash. The memset call happily
>>>> clears an array in cached memory with different offsets.
>>>>
>>>> I have posted a patch to spcache01 that generates the crash on Versal
>>>> and ZynqMP hardware. The crash dump is:
>>>>
>>>> test cache coherent allocation
>>>> clear cache coherent with memset: 0x1fe00050
>>>>
>>>>
>>>> *** FATAL ***
>>>> fatal source: 9 (RTEMS_FATAL_SOURCE_EXCEPTION)
>>>>
>>>>
>>>> X0   = 0x000000001fe00050  X17  = 0x000000000000000c
>>>> X1   = 0x0000000000000000  X18  = 0x00000000100007b0
>>>> X2   = 0x0000000000000110  X19  = 0x000000001fe00050
>>>> X3   = 0x000000001fe000c0  X20  = 0x000000001fdfff80
>>>> X4   = 0x000000001fe00250  X21  = 0x0000000010013ab0
>>>> X5   = 0x0000000000000004  X22  = 0x0000000000000000
>>>> X6   = 0x0000000000000001  X23  = 0x00000000ffffffff
>>>> X7   = 0x0000000000000000  X24  = 0x0000000010103140
>>>> X8   = 0x0000000000000000  X25  = 0x0000000000000000
>>>> X9   = 0xffffff80ffffffc8  X26  = 0x0000000000000000
>>>> X10  = 0x0000000000000000  X27  = 0x0000000000000000
>>>> X11  = 0x000000001010ca78  X28  = 0x0000000000000000
>>>> X12  = 0x0000000000000001  FP   = 0x000000001010cc30
>>>> X13  = 0x000000001fe00050  LR   = 0x0000000010001f94
>>>> X14  = 0x0000000000000000  SP   = 0x000000001010cc30
>>>> X15  = 0x0000000000000004  PC   = 0x00000000100125c0
>>>> X16  = 0x000000001000f700  DAIF = 0x00000000000003c0
>>>> VEC  = 0x0000000000000004  CPSR = 0x0000000060000005
>>>> ESR  = EC: 0b100101 IL: 0b1 ISS: 0b0000000000000000001100001
>>>>        Data Abort taken without a change in Exception level
>>>> FAR  = 0x000000001fe000c0
>>>> FPCR = 0x0000000000000000  FPSR = 0x0000000000000010
>>>>
>>>> The Versal (A72) fails in exactly the same way. The allocated address
>>>> is 0x1fe00050 and the FAR is 0x1fe000c0 so I am not sure if the
>>>> "0xc0 - 0x50" section is aligning the pointer to a larger word size
>>>> for better performance and that first part is OK but the different
>>>> word size breaks.
>>>
>>> I'm running with a toolchain that was built with
>>> --targetcflags="-DPREFER_SIZE_OVER_SPEED" which affects the content of
>>> the memset function, so my memset is just loops of writes and seems to
>>> work fine.
>>
>> Oh. Maybe the eng manual needs a piece on this. Using flags on tool
>> chains like this is fine for a user because it is use at your own peril
>> however I believe patches need to be tested with the defaults for all
>> tools. It is way too hard to baseline a BSP if tweaks are needed here
>> and there.
>
> We did try to merge this to the RSB as a temporary workaround for ilp32
> issues. Kinsey may have realized it had this impact also but I don't
> recall being aware of it.

Sure, and we need to accommodate this, but I think as a policy we need to
make sure patches are tested with the default tool sets. I cannot see how
we can make things work without that happening.

> We didn't want it to be a local hack. :)

It may have to be just that. It seems to me we have an ILP32 BSP that needs
a special set of tools, and that would constrain every other aarch64 BSP if
it became the default. Do we want that? If cached memory gets a performance
boost from a better memset, memcpy, etc., then I hope that is available to
me by default.

>>> Just out of curiosity, what instruction was at that PC address? If it
>>> was "dc zva", then I had seen this a while back during initial AArch64
>>> bringup and had assumed it was fixed since the addition of the MMU code
>>> since that instruction doesn't work on device memory.
>>
>> It is that instruction ...
>>
>> 100125c0: d50b7423 dc zva, x3
>>
>> Looks like it is not fixed.
>
> I think your suggestion that FreeBSD should not use memset for device
> memory is the right path though. But that could be in a lot of places. :(

I do not know. The allocation is under the bus space DMA allocator and that
interface is complicated. Maybe memset is not suitable?

Chris
_______________________________________________
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel
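
For illustration only, here is a minimal sketch of the kind of "basic loop
with 8bit or 32bit writes" mentioned above, which did not crash where the
newlib memset did. The function name is hypothetical; it is not an RTEMS,
newlib or libbsd API. It assumes that plain, naturally aligned stores are
acceptable for the memory in question, which is what the basic loop test
suggests but which has not been verified against the memory attributes
used for the cache coherent region.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an existing RTEMS or libbsd function: zero a
 * region using only plain stores. The volatile accesses keep the
 * compiler from replacing the loops with a call to memset(), which on
 * aarch64 newlib may use "dc zva".
 */
void clear_with_plain_stores(void *dst, size_t len)
{
  volatile uint8_t *p = dst;

  /* Byte stores until the pointer is 4-byte aligned. */
  while (len > 0 && ((uintptr_t) p & 3u) != 0) {
    *p++ = 0;
    --len;
  }

  /* Naturally aligned 32-bit stores for the bulk of the region. */
  volatile uint32_t *w = (volatile uint32_t *) p;
  while (len >= 4) {
    *w++ = 0;
    len -= 4;
  }

  /* Byte stores for whatever tail is left. */
  p = (volatile uint8_t *) w;
  while (len > 0) {
    *p++ = 0;
    --len;
  }
}
```

This trades speed for predictability: every access is an ordinary aligned
store, so nothing depends on the cache zeroing instruction being usable for
the target address. Whether a helper like this belongs in the rtemsbsd
allocator or somewhere else is a separate question.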