Adding Simon who wrote the code. Chandan Rajendra <chan...@linux.ibm.com> writes: > When executing fstests' generic/026 test, I hit the following call trace, > > [ 417.061038] BUG: Unable to handle kernel data access at 0xc00000062ac40000 > [ 417.062172] Faulting instruction address: 0xc000000000092240 > [ 417.062242] Oops: Kernel access of bad area, sig: 11 [#1] > [ 417.062299] LE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries > [ 417.062366] Modules linked in: > [ 417.062401] CPU: 0 PID: 27828 Comm: chacl Not tainted > 5.0.0-rc2-next-20190115-00001-g6de6dba64dda #1 > [ 417.062495] NIP: c000000000092240 LR: c00000000066a55c CTR: > 0000000000000000 > [ 417.062567] REGS: c00000062c0c3430 TRAP: 0300 Not tainted > (5.0.0-rc2-next-20190115-00001-g6de6dba64dda) > [ 417.062660] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: > 44000842 XER: 20000000 > [ 417.062750] CFAR: 00007fff7f3108ac DAR: c00000062ac40000 DSISR: 40000000 > IRQMASK: 0 > GPR00: 0000000000000000 c00000062c0c36c0 c0000000017f4c00 > c00000000121a660 > GPR04: c00000062ac3fff9 0000000000000004 0000000000000020 > 00000000275b19c4 > GPR08: 000000000000000c 46494c4500000000 5347495f41434c5f > c0000000026073a0 > GPR12: 0000000000000000 c0000000027a0000 0000000000000000 > 0000000000000000 > GPR16: 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > GPR20: c00000062ea70020 c00000062c0c38d0 0000000000000002 > 0000000000000002 > GPR24: c00000062ac3ffe8 00000000275b19c4 0000000000000001 > c00000062ac30000 > GPR28: c00000062c0c38d0 c00000062ac30050 c00000062ac30058 > 0000000000000000 > [ 417.063563] NIP [c000000000092240] memcmp+0x120/0x690 > [ 417.063635] LR [c00000000066a55c] xfs_attr3_leaf_lookup_int+0x53c/0x5b0 > [ 417.063709] Call Trace: > [ 417.063744] [c00000062c0c36c0] [c00000000066a098] > xfs_attr3_leaf_lookup_int+0x78/0x5b0 (unreliable) > [ 417.063851] [c00000062c0c3760] [c000000000693f8c] > xfs_da3_node_lookup_int+0x32c/0x5a0 > [ 417.063944] [c00000062c0c3820] [c0000000006634a0] > xfs_attr_node_addname+0x170/0x6b0 > [ 417.064034] [c00000062c0c38b0] [c000000000664ffc] xfs_attr_set+0x2ac/0x340 > [ 417.064118] [c00000062c0c39a0] [c000000000758d40] __xfs_set_acl+0xf0/0x230 > [ 417.064190] [c00000062c0c3a00] [c000000000758f50] xfs_set_acl+0xd0/0x160 > [ 417.064268] [c00000062c0c3aa0] [c0000000004b69b0] set_posix_acl+0xc0/0x130 > [ 417.064339] [c00000062c0c3ae0] [c0000000004b6a88] > posix_acl_xattr_set+0x68/0x110 > [ 417.064412] [c00000062c0c3b20] [c0000000004532d4] __vfs_setxattr+0xa4/0x110 > [ 417.064485] [c00000062c0c3b80] [c000000000454c2c] > __vfs_setxattr_noperm+0xac/0x240 > [ 417.064566] [c00000062c0c3bd0] [c000000000454ee8] vfs_setxattr+0x128/0x130 > [ 417.064638] [c00000062c0c3c30] [c000000000455138] setxattr+0x248/0x600 > [ 417.064710] [c00000062c0c3d90] [c000000000455738] path_setxattr+0x108/0x120 > [ 417.064785] [c00000062c0c3e00] [c000000000455778] sys_setxattr+0x28/0x40 > [ 417.064858] [c00000062c0c3e20] [c00000000000bae4] system_call+0x5c/0x70 > [ 417.064930] Instruction dump: > [ 417.064964] 7d201c28 7d402428 7c295040 38630008 38840008 408201f0 4200ffe8 > 2c050000 > [ 417.065051] 4182ff6c 20c50008 54c61838 7d201c28 <7d402428> 7d293436 > 7d4a3436 7c295040 > [ 417.065150] ---[ end trace 0d060411b5e3741b ]--- > > > Both the memory locations passed to memcmp() had "SGI_ACL_FILE" and len > argument of memcmp() was set to 12. s1 argument of memcmp() had the value > 0x00000000f4af0485, while s2 argument had the value 0x00000000ce9e316f. > > The following is the code path within memcmp() that gets executed for the > above mentioned values, > > - Since len (i.e. 12) is greater than 7, we branch to .Lno_short. > - We then prefetch the contents of r3 & r4 and branch to > .Ldiffoffset_8bytes_make_align_start. > - Under .Ldiffoffset_novmx_cmp, Since r3 is unaligned we end up comparing > "SGI" part of the string. r3's value is then aligned. r4's value is > incremented by 3. For comparing the remaining 9 bytes, we jump to > .Lcmp_lt32bytes. > - Here, 8 bytes of the remaining 9 bytes are compared and execution moves to > .Lcmp_rest_lt8bytes. > - Here we execute "LD rB,0,r4". In the case of this bug, r4 has an unaligned > value and hence ends up accessing the "next" double word. The "next" double > word happens to occur after the last page mapped into the kernel's address > space and hence this leads to the previously listed oops.
Thanks for the analysis. This is just a bug, we can't read past the end of the source or dest. We have a selftest for memcmp, but clearly it doesn't exercise this case. Here's a patch to try and trip it: diff --git a/tools/testing/selftests/powerpc/stringloops/memcmp.c b/tools/testing/selftests/powerpc/stringloops/memcmp.c index b1fa7546957f..edca3abb6ecf 100644 --- a/tools/testing/selftests/powerpc/stringloops/memcmp.c +++ b/tools/testing/selftests/powerpc/stringloops/memcmp.c @@ -2,6 +2,7 @@ #include <malloc.h> #include <stdlib.h> #include <string.h> +#include <sys/mman.h> #include <time.h> #include "utils.h" @@ -68,25 +69,24 @@ static void test_one(char *s1, char *s2, unsigned long max_offset, static int testcase(bool islarge) { - char *s1; - char *s2; - unsigned long i; - - unsigned long comp_size = (islarge ? LARGE_SIZE : SIZE); - unsigned long alloc_size = comp_size + MAX_OFFSET_DIFF_S1_S2; - int iterations = islarge ? LARGE_ITERATIONS : ITERATIONS; - - s1 = memalign(128, alloc_size); - if (!s1) { - perror("memalign"); - exit(1); - } + unsigned long i, comp_size, alloc_size, map_chunk; + char *p, *s1, *s2; + int iterations; - s2 = memalign(128, alloc_size); - if (!s2) { - perror("memalign"); - exit(1); - } + comp_size = (islarge ? LARGE_SIZE : SIZE); + alloc_size = comp_size + MAX_OFFSET_DIFF_S1_S2; + iterations = islarge ? LARGE_ITERATIONS : ITERATIONS; + + map_chunk = 64 * 1024; + p = mmap(NULL, 4 * map_chunk, PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + FAIL_IF(p == MAP_FAILED); + + s1 = p + map_chunk - alloc_size; + s2 = p + 3 * map_chunk - alloc_size; + + munmap(p + map_chunk, map_chunk); + munmap(p + 3 * map_chunk, map_chunk); srandom(time(0)); cheers