[ https://issues.apache.org/jira/browse/TS-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan M. Carroll updated TS-4027:
--------------------------------
    Fix Version/s:     (was: 6.1.0)
                       6.2.0

> incorrect usage of mmap
> -----------------------
>
>                 Key: TS-4027
>                 URL: https://issues.apache.org/jira/browse/TS-4027
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 6.0.0
>            Reporter: Jason Kenny
>             Fix For: 6.2.0
>
>
> The specific type of the crash caused by this failure (whether it is the
> SCT_SetReg assertion, or Unexpected memory deallocation error) is pretty
> random.
> This is because the crash is caused by the application’s problematic/buggy
> behavior in a way I’ll describe below.
> This is a trace of the syscalls invoked by traffic_server as generated with
> strace (running without Pin) - I numbered the important syscalls:
>
> open("/tmp/yts/var/trafficserver/host.db", O_RDWR|O_CREAT, 0644) = 118
> fstat(118, {st_mode=S_IFREG|0644, st_size=25935872, ...}) = 0
> open("/dev/zero", O_RDONLY) = 119
> 1. mmap(NULL, 25935872, PROT_READ, MAP_SHARED|MAP_NORESERVE, 119, 0) = 0x7fffef935000
> 2. munmap(0x7fffef935000, 25935872) = 0
> 3. mmap(0x7fffef935000, 966656, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED|MAP_NORESERVE, 118, 0) = 0x7fffef935000
> mmap(0x7fffefa21000, 7675904, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED|MAP_NORESERVE, 118, 0xec000) = 0x7fffefa21000
> mmap(0x7ffff0173000, 17285120, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED|MAP_NORESERVE, 118, 0x83e000) = 0x7ffff0173000
> mmap(0x7ffff11ef000, 8192, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED|MAP_NORESERVE, 118, 0x18ba000) = 0x7ffff11ef000
> close(119) = 0
> close(118) = 0
>
> 1. You can see that the application first calls mmap with the “addr”
>    argument equal to NULL in order to get a memory region that covers the
>    whole region requested for mapping the file host.db (25935872 bytes).
> 2. The application unmaps the allocated region (at address
>    0x7fffef935000), but remembers this region as a “free” region.
> 3. Then, the application tries to mmap each region of the file to
>    addresses inside the region (from step 1) that it considers “free”.
>    These mmaps are called with the MAP_FIXED flag, meaning that if memory
>    is already mapped in the requested range, the kernel still maps the
>    requested memory there and the existing mapping is silently discarded
>    (replaced).
>
> As you can probably guess, between steps 2 and 3 Pin asks the OS to
> allocate memory for its own purposes and gets some memory inside the
> unmapped region that the application considers “free”.
> This causes the mappings in step 3 to overwrite Pin’s memory and corrupt
> it, leading to this crash.
> This is a bug in the application (traffic_server) that needs to be fixed.
> In a multi-threaded environment (traffic_server is multi-threaded) another
> thread can request a memory mapping (via mmap) between steps 2 and 3 and
> get a memory region that intersects with the problematic region.
> This allocated memory will eventually be corrupted.
> You don’t need Pin or a dynamic instrumentation environment to reproduce
> this bug!
>
> BTW, the application call-stack of this crash, as reported by Inspector, is:
> <!-- [8]: pc = 0x00000000006df235, sp = 0x00007fffffffa560, min_sp = 0x00007fffffffa560, _ZN14MultiCacheBase11mmap_regionEiPiPcRmbi, :0 -->
> <!-- [7]: pc = 0x00000000006dfc93, sp = 0x00007fffffffa610, min_sp = 0x00007fffffffa610, _ZN14MultiCacheBase9mmap_dataEbb, :0 -->
> <!-- [6]: pc = 0x00000000006e0ec3, sp = 0x00007fffffffbb80, min_sp = 0x00007fffffffbb80, _ZN14MultiCacheBase4openEP5StorePKcPcibbb, :0 -->
> <!-- [5]: pc = 0x00000000006ccfd5, sp = 0x00007fffffffcca0, min_sp = 0x00007fffffffcca0, _ZN11HostDBCache5startEi, :0 -->
> <!-- [4]: pc = 0x00000000006cd31e, sp = 0x00007fffffffdd60, min_sp = 0x00007fffffffdd60, _ZN15HostDBProcessor5startEim, :0 -->
> <!-- [3]: pc = 0x000000000053c771, sp = 0x00007fffffffddd0, min_sp = 0x00007fffffffddd0, main, :0 -->
> <!-- [2]: pc = 0x0000003acbc1ed19, sp = 0x00007fffffffdf30, min_sp = 0x00007fffffffdf30, __libc_start_main, :0 -->
> <!-- [1]: pc = 0x00000000004ee824, sp = 0x00007fffffffdff0, min_sp = 0x00007fffffffdff0, _start, :0 -->

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)