On 2026/1/21 15:24, Qingfang Deng wrote:
> On Tue, 20 Jan 2026 14:58:50 +0800, Feng Jiang wrote:
>> diff --git a/arch/riscv/lib/strnlen.S b/arch/riscv/lib/strnlen.S
>
> Branches that test maxlen can be replaced with Zbb minu instruction.
> (see below)
>
...
>> +	/*
>> +	 * The first chunk is special: compare against the number
>> +	 * of valid bytes in this chunk.
>> +	 */
>> +	srli	a0, t1, 3
>> +
>> +	/* Limit the result by maxlen. */
>> +	bleu	a1, a0, 3f
>
> minu a0, a0, a1
>
>> +
>> +	bgtu	t3, a0, 2f
>> +
>> +	/* Prepare for the word comparison loop. */
>> +	addi	t2, t0, SZREG
>> +	li	t3, -1
>> +
>> +	/*
>> +	 * Our critical loop is 4 instructions and processes data in
>> +	 * 4 byte or 8 byte chunks.
>> +	 */
>> +	.p2align 3
>> +1:
>> +	REG_L	t1, SZREG(t0)
>> +	addi	t0, t0, SZREG
>> +	orc.b	t1, t1
>> +	bgeu	t0, t4, 4f
>> +	beq	t1, t3, 1b
>> +4:
>> +	not	t1, t1
>> +	CZ	t1, t1
>> +	srli	t1, t1, 3
>> +
>> +	/* Get number of processed bytes. */
>> +	sub	t2, t0, t2
>> +
>> +	/* Add number of characters in the first word. */
>> +	add	a0, a0, t2
>> +
>> +	/* Add number of characters in the last word. */
>> +	add	a0, a0, t1
>> +
>> +	/* Ensure the final result does not exceed maxlen. */
>> +	bgeu	a0, a1, 3f
>
> minu a0, a0, a1
>

Thanks for the great suggestion! I see your point now: using minu is
indeed a much more elegant and efficient way to handle the maxlen
constraint. It nicely eliminates unnecessary branches and simplifies the
code while still allowing for early returns.

I'll incorporate this into a v4 patch and add a Suggested-by tag for you.
Thanks again for your insightful review!

-- 
With Best Regards,
Feng Jiang
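
For reference, the final clamp discussed above can be modeled in C. This
is only a rough sketch, not the patch itself: the helper names minu(),
clamp_branch() and clamp_minu() are made up for illustration, and it
assumes that label 3 (not shown in the quoted hunk) simply returns maxlen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: models Zbb "minu rd, rs1, rs2" (unsigned minimum). */
static inline size_t minu(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Branchy form, modeled on "bgeu a0, a1, 3f".
 * Assumption: label 3 returns maxlen; the quoted hunk does not show it.
 */
static size_t clamp_branch(size_t count, size_t maxlen)
{
	if (count >= maxlen)
		return maxlen;
	return count;
}

/* Branchless form, modeled on "minu a0, a0, a1". */
static size_t clamp_minu(size_t count, size_t maxlen)
{
	return minu(count, maxlen);
}

/* Check that both forms agree over a small range of inputs. */
static void check_equivalence(void)
{
	for (size_t count = 0; count < 64; count++)
		for (size_t maxlen = 0; maxlen < 64; maxlen++)
			assert(clamp_branch(count, maxlen) ==
			       clamp_minu(count, maxlen));
}
```

Under that assumption the two forms return the same value for every
input, so the minu version only removes the conditional branch from the
final clamp.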
