https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80479
--- Comment #10 from acsawdey at gcc dot gnu.org ---
OK, so I'm the culprit who added the strncmp/strcmp inline expansion.

If both strings have alignment >= 8 we cannot inadvertently cross a page boundary doing 8B loads. For any argument that has smaller alignment, the expansion emits a runtime check to see if the inline code would cross a 4k boundary. If so, the library function call is used instead of the inline code.

The testcase gcc.dg/strncmp-2.c tests that we don't step over this by allocating 2 pages and using mprotect PROT_NONE on the second, then trying to provoke things by putting strings right up to the boundary. The code generation for strncmp/strcmp is done by the same code in rs6000.c, so testing strncmp for this mostly also tests whether strcmp has any issues.

The generated comparison code, while it does use 8B loads, also makes use of cmpb to make sure that data beyond the 0 byte is not significant in the result.

Startup: load two doublewords, are they equal?

        ldbrx 9,28,10
        ldbrx 10,30,10
        subf. 3,10,9
        beq 0,.L23

If they are, go to this piece that looks to see if there was a zero byte:

.L23:
        cmpb 10,9,3
        cmpdi 7,10,0
        beq 7,.L22

If we don't branch, the strings are equal, result of zero is in r3 and we are done.

If we didn't branch to .L23 above, we fall through to this piece that computes the final result by finding the correct differing byte and subtracting:

.L11:
        cmpb 3,9,10
        cmpb 8,9,26
        addi 31,31,1
        orc 3,8,3
        cntlzd 3,3
        addi 3,3,8
        rldcl 9,9,3,56
        rldcl 3,10,3,56
        subf 3,3,9
        extsw 9,3

If we did go to .L22 then we have a repeating sequence like this to load and compare 8B at a time:

.L22:
        addi 9,8,8
        addi 10,4,8
        ldbrx 9,0,9
        ldbrx 10,0,10
        subf. 3,10,9
        bne 0,.L11
        cmpb 10,9,3
        cmpdi 7,10,0
        bne 7,.L10

Here we either go to the .L11 piece to extract the differing bytes and subtract, or we found a zero byte, the strings are equal (r3=0), and we bail out to .L10.
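For readers who don't have the POWER ISA in their head, here is a rough C sketch of what one 8B step of the sequence above computes. It is not what GCC emits: cmpb_emul, load_be64, first_marked_byte and compare_dword are hypothetical names made up for illustration, and cmpb/ldbrx/cntlzd are emulated with plain loops so the sketch runs on any host.

```c
#include <stdint.h>

/* Portable stand-in for cmpb: each result byte is 0xff where the
   corresponding bytes of a and b are equal, 0x00 otherwise. */
static uint64_t cmpb_emul(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        uint64_t mask = (uint64_t)0xff << (8 * i);
        if ((a & mask) == (b & mask))
            r |= mask;
    }
    return r;
}

/* Load 8 bytes so the first byte in memory ends up most significant
   (the effect ldbrx has on a little-endian target). */
static uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Index (0 = most significant) of the first nonzero byte in m, or 8 if
   m is zero.  Plays the role of cntlzd on a 0x00/0xff byte mask. */
static int first_marked_byte(uint64_t m)
{
    for (int i = 0; i < 8; i++)
        if ((m >> (56 - 8 * i)) & 0xff)
            return i;
    return 8;
}

/* Compare one doubleword of each string.
   Returns 0 if all 8 bytes are equal and none is zero (keep looping).
   Returns 1 and stores the strcmp-style result in *result otherwise. */
static int compare_dword(const unsigned char *s1, const unsigned char *s2,
                         int *result)
{
    uint64_t a = load_be64(s1);
    uint64_t b = load_be64(s2);
    uint64_t zero = cmpb_emul(a, 0);        /* 0xff where s1 has a 0 byte */

    if (a == b) {
        if (zero == 0)
            return 0;                       /* no terminator yet */
        *result = 0;                        /* equal up to the terminator */
        return 1;
    }

    /* Like the orc: mark bytes that are zero in s1 or differ from s2. */
    uint64_t stop = zero | ~cmpb_emul(a, b);
    int shift = 56 - 8 * first_marked_byte(stop);

    /* Only the first marked byte decides; anything after it is ignored. */
    *result = (int)((a >> shift) & 0xff) - (int)((b >> shift) & 0xff);
    return 1;
}
```

With this sketch, comparing "abc\0XXXX" against "abc\0YYYY" stops at the zero byte and yields 0 even though the trailing bytes differ, which is the property the cmpb usage is there to guarantee.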
At the end of our 64 bytes of inline comparison we bail out to strcmp:

        addi 4,4,64
        addi 3,8,64
        bl strcmp

So, yes we do read 8B at a time, but the code makes use of cmpb so that the bytes following the zero byte are never significant to the comparison. On the other hand I've already had to fix this a couple of times so it is certainly possible that errors remain -- please do let me know if you see something.
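For reference, the 4k-boundary runtime check mentioned at the top of this comment can be sketched in C roughly as follows. This is only an illustration of the idea, not the instruction sequence rs6000.c generates; would_cross_4k and BOUNDARY_4K are hypothetical names, and the 64-byte length in the usage note is simply the inline comparison length from the example above.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed boundary: the inline expansion checks against 4k blocks. */
#define BOUNDARY_4K ((uintptr_t)4096)

/* Returns nonzero if reading len bytes starting at p would run past the
   end of the 4k block that contains p. */
static int would_cross_4k(const void *p, size_t len)
{
    uintptr_t offset = (uintptr_t)p & (BOUNDARY_4K - 1);
    return offset + len > BOUNDARY_4K;
}
```

A caller in the spirit of the expansion would test each insufficiently aligned argument, e.g. would_cross_4k(s1, 64) || would_cross_4k(s2, 64), and call the library strcmp/strncmp instead of the inline code when either test is true.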