Hi Midgy,

This is a great start, congrats. Please see below:

On Thu, Jun 4, 2026 at 4:11 PM Midgy BALON <[email protected]> wrote:
>
> Hi,
>
> I'm adding kernel support for the RK3568 NPU to the rocket accel driver
> (RFC just posted to dri-devel / linux-rockchip [1]). The RK3568 NPU is the
> same NVDLA-derived IP as the RK3588 that the rocket Mesa/Teflon driver
> targets, but on RK3568 a simple conv2d through Teflon produces wrong output
> (every element is the output zero-point), and the command list Mesa emits
> differs from the vendor (RKNPU) one for the same convolution.
>
> Comparing Mesa rocket's CNA config to a byte-exact RKNPU capture for the
> *same* convolution (conv2d, 80x80x16 -> 40x40x128, 5x5 stride 2, int8) on
> RK3568, several values look RK3588-specific:
>
> - CONV_CON4 (0x1018): Mesa 0x00019000 ; vendor 0 for this conv
> - line_stride (0x107c): Mesa 0x140 ; vendor 0x50 (4x too large)
> - CONV_CON2 KERNEL_GROUP: Mesa 0x00030080 ; vendor 0x70
> - DMA_CON0 fetch_pixel_len, CBUF_CON1, and several DPU/ACCU/DPU_RDMA
>   registers also differ.
>
> Structurally, rkt_ml.h assumes the RK3588 CBUF size (RK3568 has 256 KiB /
> 8 banks, not 12), and the DCOMP descriptor layout differs between the two
> (RK3568: 8 ADDR + 8 AMOUNT registers; RK3588: 1 ADDR + 16 AMOUNT). The
> conv lowering in rkt_regcmd.c has essentially one RK3568 reference.
>
> Important caveat so this isn't misread: replaying the vendor's byte-exact
> command list through the mainline rocket *kernel* driver on RK3568 also does
> not execute (the MAC/output stage never completes), so the command-list
> content is not the only blocker -- there is a hardware/driver bring-up issue
> I'm still chasing on the kernel side.

Feel free to ping me in #ml-mainline on OFTC. At some point with rk3588
enablement I was in the same position and maybe I can share some useful
tricks to get past this.

> I'm sending this so the userspace
> config divergences are on record for RK3568 enablement,

Would you be able to start a Draft merge request at
https://gitlab.freedesktop.org/mesa/mesa/ with this? It may make it easier
to collaborate, even if I don't have hardware with rk3568.

> and to ask:
>
> - Is per-SoC config generation (rkt_regcmd.c / rkt_ml.h parameterised by
>   SoC, as the kernel side now is) the direction you'd want for RK3568?

Sure, but the specific way I would like to parameterize this depends on the
volume of differences between HW revisions.

> - Are there known RK3588-specific assumptions baked into the conv lowering
>   that I should look at first?

I'm afraid there could be lots of RK3588 assumptions. I basically started
hardcoding a cmd stream, then parameterizing as I enabled more and more
operations.

> Happy to provide the full register-by-register diff and the captures.

Yes, I would start by coming up with the simplest workload we can submit and
focus on getting that working. I would start by comparing all payloads
passed to the HW, and if that's not enough, then all register writes from
the kernel to the relevant addresses.

Cheers,

Tomeu

> Hardware: Radxa ROCK 3B (RK3568). Mesa 25.3.0. Test: conv2d.tflite via the
> Teflon delegate. Kernel: rocket with the RFC RK3568 series applied.
>
> [1] https://lore.kernel.org/linux-rockchip/?q=Add+RK3568+NPU+support
>
> Thanks,
> Midgy BALON
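
The register-by-register comparison discussed in this thread could be sketched roughly as below. This is only an illustration, not part of Mesa or the rocket driver: the helper name `diff_regs` and the dict-based representation (register address mapped to value) are assumptions, and the sample data contains only the two registers whose address and values are both quoted above.

```python
def diff_regs(mesa, vendor):
    """Return {addr: (mesa_value, vendor_value)} for registers that differ.

    Both arguments map a register address to the value written to it.
    A register present on only one side is reported with None for the other.
    """
    differences = {}
    for addr in sorted(set(mesa) | set(vendor)):
        mesa_value = mesa.get(addr)
        vendor_value = vendor.get(addr)
        if mesa_value != vendor_value:
            differences[addr] = (mesa_value, vendor_value)
    return differences


def fmt(value):
    """Format a register value as hex, or mark it as absent."""
    return "absent" if value is None else f"0x{value:08x}"


# Hypothetical inputs built from the values quoted in the thread.
mesa = {0x1018: 0x00019000, 0x107C: 0x140}   # CONV_CON4, line_stride
vendor = {0x1018: 0x0, 0x107C: 0x50}

for addr, (mesa_value, vendor_value) in diff_regs(mesa, vendor).items():
    print(f"0x{addr:04x}: mesa={fmt(mesa_value)} vendor={fmt(vendor_value)}")
```

Registers missing from one capture are kept in the result so that a write emitted by only one of the two stacks is not silently dropped from the comparison.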
