> When fork support is enabled in libibverbs, madvise() is called for every
> memory page that is registered as a memory region. Memory ranges that
> are passed to madvise() must be page aligned and the size must be a
> multiple of the page size. libibverbs uses sysconf(_SC_PAGESIZE) to find
> out the system page size and rounds all ranges passed to reg_mr() according
> to this page size. When memory from libhugetlbfs is passed to reg_mr(), this
> does not work as the page size for this memory range might be different
> (e.g. 16Mb). So libibverbs would have to use the huge page size to
> calculate a page aligned range for madvise.

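
To make the rounding problem concrete, here is a minimal sketch of the alignment arithmetic. It is not the libibverbs code; `struct aligned_range` and `align_range()` are hypothetical names used only for illustration, and the page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libibverbs: the smallest range that
 * encloses [addr, addr + len) and is aligned to page_size. */
struct aligned_range {
    uintptr_t start;  /* addr rounded down to a page boundary */
    size_t    length; /* a multiple of page_size */
};

static struct aligned_range align_range(uintptr_t addr, size_t len,
                                        size_t page_size)
{
    struct aligned_range r;
    uintptr_t mask = (uintptr_t)page_size - 1;
    uintptr_t end;

    /* The mask trick below only works for power-of-two page sizes. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    r.start = addr & ~mask;             /* round start down */
    end = (addr + len + mask) & ~mask;  /* round end up */
    r.length = (size_t)(end - r.start);
    return r;
}
```

For a 4 KB buffer at address 0x1000800, a 4 KB page size gives the range 0x1000000 to 0x1002000, while a 16 MB page size gives the whole huge page from 0x1000000 to 0x2000000. Rounding with the wrong page size therefore hands madvise() a range that does not match the huge page boundaries.
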
Yes, Alex Vainman raised this same issue a while ago.

> The patch below demonstrates a possible solution for this. It parses the
> /proc/PID/maps file when registering a memory region and decides if the
> memory that is to be registered is part of a libhugetlbfs range or not. If
> so, a page size of 16Mb is used to align the memory range passed to
> madvise().
>
> We see two problems with this: it is not a very elegant solution to parse the
> procfs file and the 16Mb are hardcoded currently. The latter point could be
> solved by calling gethugepagesize() from libhugetlbfs, which would add a new
> dependency to libibverbs.

I think that we cannot assume huge pages only come from libhugetlbfs --
we should support an application directly enabling huge pages (possibly
via another library too, so we can't assume that an application knows
the page size for a memory range it is about to register).

And also the 16 MB page size constant is of course not feasible -- with
all due respect, the x86 page size of 2 MB is much more likely in
practice :) (Although perhaps the much slower PowerPC TLB refill makes
users more likely to try and use hugetlb pages ;)

Alex suggested parsing files in the same way as libhugetlbfs does to get
the page size, and that seems to be the best solution, since I don't
think the libhugetlbfs license is compatible with the BSD license for
libibverbs.

But your trick of using /proc/*/maps looks nice. Does that only work
for libhugetlbfs or can we recognize direct mmap of hugetlb pages?

 - R.

--
Roland Dreier <rola...@cisco.com>
_______________________________________________
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
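
As a rough illustration of getting the huge page size by parsing files instead of hardcoding it or linking against libhugetlbfs, the sketch below reads the `Hugepagesize:` entry from /proc/meminfo. This assumes that entry is the relevant source of the size; the helper names `parse_hugepagesize_line()` and `default_huge_page_size()` are hypothetical and come from neither library.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: parse one /proc/meminfo line. Returns the huge page
 * size in bytes, or 0 if the line is not a "Hugepagesize:" entry. */
static size_t parse_hugepagesize_line(const char *line)
{
    unsigned long kb;

    /* The entry looks like "Hugepagesize:       2048 kB". */
    if (sscanf(line, "Hugepagesize: %lu kB", &kb) == 1)
        return (size_t)kb * 1024;
    return 0;
}

/* Hypothetical helper: scan /proc/meminfo for the default huge page size.
 * Returns 0 if the file cannot be opened or has no such entry. */
static size_t default_huge_page_size(void)
{
    char line[256];
    size_t size = 0;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return 0;
    while (size == 0 && fgets(line, sizeof(line), f))
        size = parse_hugepagesize_line(line);
    fclose(f);
    return size;
}
```

This only reports the system's default huge page size. It does not say whether a particular address range is actually backed by huge pages, so the /proc/*/maps question above still has to be answered separately.
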