On Mon, Nov 5, 2018 at 7:43 AM Liang, Kan <kan.li...@linux.intel.com> wrote: > > > > On 11/5/2018 5:59 AM, Stephane Eranian wrote: > > Hi Kan, > > > > I built a small test case for you to demonstrate the issue for code and > > data. > > Compile the test program and then do: > > For text: > > $ perf record ./mmap > > $ perf report -D | fgrep MMAP2 > > > > The test program mmaps 2 pages, unmaps the second, and remap 1 page > > over the freed space. > > If you look at the MMAP2 record, you will not be able to reconstruct > > what happened and perf will > > get confused should it try to symbolize from the address range > > > > With Text: > > PERF_RECORD_MMAP2 5937/5937: [0x400000(0x1000) @ 0 08:01 400938 > > 824817672]: r-xp /home/eranian/mmap > > PERF_RECORD_MMAP2 5937/5937: [0x7f7c01019000(0x2000) @ 0x7f7c01019000 > > 00:00 0 0]: rwxp //anon > > PERF_RECORD_MMAP2 5937/5937: [0x7f7c01019000(0x2000) @ 0x7f7c01019000 > > 00:00 0 0]: rwxp //anon > > > > ^^^^^^^^^^^^^^^^^^^^^^^^ captures the whole VMA but not the mapping > > change in user space > > > > For data: > > $ perf record -d ./mmap > > $ perf report -D | fgrep MMAP2 > > With data: > > PERF_RECORD_MMAP2 6430/6430: [0x400000(0x1000) @ 0 08:01 400938 > > 3278843184]: r-xp /home/eranian/mmap > > PERF_RECORD_MMAP2 6430/6430: [0x7f4aa704b000(0x2000) @ 0x7f4aa704b000 > > 00:00 0 0]: rw-p //anon > > PERF_RECORD_MMAP2 6430/6430: [0x7f4aa704b000(0x2000) @ 0x7f4aa704b000 > > 00:00 0 0]: rw-p //anon > > > > Same test case with data. > > Perf will think the entire 2 pages have been replaced when in fact > > only the second has. > > I believe the problem is likely to impact data and jitted code cache > > > > #include <sys/types.h> > > #include <stdio.h> > > #include <unistd.h> > > #include <sys/mman.h> > > #include <err.h> > > #include <getopt.h> > > > > int main(int argc, char **argv) > > { > > void *addr1, *addr2; > > size_t pgsz = sysconf(_SC_PAGESIZE); > > int n = 2; > > int ret; > > int c, mode = 0; > > > > while ((c = getopt(argc, argv, "hd")) != -1) { > > switch (c) { > > case 'h': > > printf("[-h]\tget this help\n"); > > printf("[-d]\tuse data mmaps (no PROT_EXEC)\n"); > > return 0; > > case 'd': > > mode = PROT_EXEC; > > break; > > default: > > errx(1, "unknown option"); > > } > > } > > /* default to data */ > > if (mode == 0) > > mode = PROT_WRITE; > > > > /* > > * mmap 2 contiugous pages > > */ > > addr1 = mmap(NULL, n * pgsz, PROT_READ| mode, MAP_ANONYMOUS|MAP_PRIVATE, > > -1, 0); > > if (addr1 == (void *)MAP_FAILED) > > err(1, "mmap 1 failed"); > > > > printf("addr1=[%p : %p]\n", addr1, addr1 + n * pgsz); > > > > /* > > * unmap only the second page > > */ > > ret = munmap(addr1 + pgsz, pgsz); > > if (ret == -1) > > err(1, "munmp failed"); > > > > /* > > * mmap 1 page at the location of the unmap page (should reuse virtual space) > > * This creates a continuous region built from two mmaps and > > potentially two different sources > > * especially with jitted runtimes > > */ > > The two mmaps are both anon. As my understanding, we cannot symbolize > from the anonymous address, can we? Can't we build the same test case using an actual file mapping (both mmap from the same file)?
> If we cannot, why we have to distinguish with them? I think we do not > need to know their sources for symbolization. > > As my understanding, only --jit can inject MMAP event, which tag an > anon. Perf can symbolize the address after that. Then the unmap is needed. > Yes, perf inject --jit injects timestamped MMAP2 records to cover the jitted regions which helps symbolize anons. > Thanks, > Kan > > addr2 = mmap(addr1 + pgsz, 1 * pgsz, PROT_READ|PROT_WRITE | mode, > > MAP_ANONYMOUS|MAP_PRIVATE, -1, 0); > > > > printf("addr2=%p\n", addr2); > > > > if (addr2 == (void *)MAP_FAILED) > > err(1, "mmap 2 failed"); > > if (addr2 != (addr1 + pgsz)) > > errx(1, "wrong mmap2 address"); > > > > sleep(1); > > > > return 0; > > } > > > > On Thu, Nov 1, 2018 at 7:10 AM Liang, Kan <kan.li...@linux.intel.com> wrote: > >> > >> > >> > >> On 10/24/2018 3:30 PM, Stephane Eranian wrote: > >>> The need for this new record type extends beyond physical address > >>> conversions > >>> and PEBS. A long while ago, someone reported issues with symbolization > >>> related > >>> to perf lacking munmap tracking. It had to do with vma merging. I think > >>> the > >>> sequence of mmaps was as follows in the problematic case: > >>> 1. addr1 = mmap(8192); > >>> 2. munmap(addr1 + 4096, 4096) > >>> 3. addr2 = mmap(addr1+4096, 4096) > >>> > >>> If successful, that yields addr2 = addr1 + 4096 (could also get the > >>> same without forcing the address). > >>> > >>> In that case, if I recall correctly, the vma for 1st mapping (now at > >>> 4k) and that of the 2nd mapping (4k) > >>> get merged into a single 8k vma and this is what perf_events will > >>> record for PERF_RECORD_MMAP. > >>> On the perf tool side, it is assumed that if two timestamped mappings > >>> overlap then, the latter overrides > >>> the former. In this case, perf would loose the mapping of the first > >>> 4kb and assume all symbols comes from > >>> 2nd mapping. Hopefully I got the scenario right. If so, then you'd > >>> need PERF_RECORD_UNMAP to > >>> disambiguate assuming the perf tool is modified accordingly. > >>> > >> > >> Hi Stephane and Peter, > >> > >> I went through the link(https://lkml.org/lkml/2017/1/27/452). I'm trying > >> to understand the problematic case. > >> > >> It looks like the issue can only be triggered by perf inject --jit. > >> Because it can inject extra MMAP events. > >> As my understanding, Linux kernel only try to merge VMAs if they are > >> both from anon or they are both from the same file. --jit breaks the > >> rule, and makes the merged VMA partly from anon, partly from file. > >> Now, there is a new MMAP event which range covers the modified VMA. > >> Without the help of MUNMAP event, perf tool have no idea if the new one > >> is a newly merged VMA (modified VMA + a new VMA) or a brand new VMA. > >> Current code just simply overwrite the modified VMAs. The VMA > >> information which --jit injected may be lost. The symbolization may be > >> lost as well. > >> > >> Except --jit, the VMAs information should be consistent between kernel > >> and perf tools. We shouldn't observe the problem. MUNMAP event is not > >> needed. > >> > >> Is my understanding correct? > >> > >> Do you have a test case for the problem? > >> > >> Thanks, > >> Kan