(Dropping Mike Kravetz as CC since he has retired and his email is no
longer valid, adding Muchun since he's the current hugetlb maintainer,
as well as linux-trace-kernel)

On 22 Apr 11:39, David Hildenbrand wrote:
>
> On 19.04.24 20:37, Guillaume Morin wrote:
> > libhugetlbfs, the Intel iodlr code both allow to remap .text onto a
> > hugetlb private mapping. It's also pretty easy to do it manually.
> > One drawback of using this functionality is the lack of support for
> > uprobes (NOTE uprobe ignores shareable vmas)
> > 
> > This patch adds support for private hugetlb mappings.  It does require 
> > exposing
> > some hugetlbfs innards and relies on copy_user_large_folio which is only
> > available when CONFIG_HUGETLBFS is used so I had to use an ugly #ifdef
> > 
> > If there is some interest in applying this patch in some form or
> > another, I am open to any refactoring suggestions (esp getting rid the
> > #ifdef in uprobes.c) . I tried to limit the
> > amount of branching.
> 
> All that hugetlb special casing .... oh my. What's the benefit why we should
> be interested in making that code less clean -- to phrase it in a nice way
> ;) ?

I do appreciate the nice phrasing. Believe me, I did try to limit the
special casing to a minimum :-).

Outside of __replace_page, I added only 3-ish branches so I do not think
it's *too* bad. The uprobe code is using PAGE_{SHIFT,MASK} quite liberally so I
had to add calls to retrieve these for the hugetlb vmas.

__replace_page has a lot of special casing. I certainly agree (and
unfortunately for me it's at the beginning of the patch :)).  It's doing
something pretty uncommon outside of the mm code so it has to make a
bunch of specific hugetlb calls. I am not quite sure how to improve it
but if you have suggestions, I'd be happy to refactor.

The benefit - to me - is very clear. People do use hugetlb mappings to
run code in production environments. The perf benefits are there for some
workloads. Intel has published a whitepaper about it etc.
Uprobes are a very good tool to do live tracing. If you can restart the
process and reproduce, you should be able to disable hugetlb remapping
but if you need to look at a live process, there are not many options.
Not being able to use uprobes is crippling.

> Yes, libhugetlbfs exists. But why do we have to support uprobes with it?
> Nobody cared until now, why care now?

I think you could ask the same question for every new feature patch :)

Since the removal a few releases ago of the __morecore() hook in glibc,
the main feature of libhugetlbfs is ELF segments remapping. I think
there are definitely a lot of users that simply deal with this
unnecessary limitation.

I am certainly not shoving this patch through anyone's throat if there
is no interest. But we definitely find it a very useful feature ...

Guillaume.

-- 
Guillaume Morin <guilla...@morinfr.org>

Reply via email to