On Mon, Jun 08, 2026 at 04:37:59PM -0400, Zi Yan wrote:
> On 8 Jun 2026, at 16:25, Gregory Price wrote:
>
> > On Mon, Jun 08, 2026 at 03:52:20PM -0400, Zi Yan wrote:
> >> On 8 Jun 2026, at 15:43, Gregory Price wrote:
> >>
> >>> On Mon, Jun 08, 2026 at 08:39:10PM +0200, David Hildenbrand (Arm) wrote:
> >>>> On 6/8/26 17:27, Zi Yan wrote:
> >>>>> On 8 Jun 2026, at 7:08, David Hildenbrand (Arm) wrote:
> >>>>>
> >>>>> Or should we defer zeroing after a page is returned from allocator? So
> >>>>> that
> >>>>> user_addr does not need to be passed through irrelevant allocation APIs.
> >>>>> Something like:
> >>>>>
> >>>>> alloc_page_wrapper(gfp, order, user_addr)
> >>>>> {
> >>>>> page = alloc_pages();
> >>>>> if (gfp & __GFP_ZERO)
> >>>>> clear_page(page);
> >>>>> }
> >>>>>
> >>>>
> >>>> Not really sure what's best here. I think we'd want to limit the lifting
> >>>> to some
> >>>> internal API, so it cannot easily be messed up by random kernel code
> >>>> calling
> >>>> into the wrong API and not getting pages cleared.
> >>>>
> >>>
> >>> We're a bit in circles on this. We discussed explicit interfaces a few
> >>> months back and the trade off was:
> >>>
> >>> a) add user_addr to the existing API and cause churn
> >>>
> >>> or
> >>>
> >>> b) add special interface like above
> >>> increase the buddy surface
> >>> leaves open the ability for users to get it wrong easily
> >>>
> >>> If we forget VMs for a moment and break this step out separately, the
> >>> core question is whether page_alloc.c is the right place to be calling
> >>> the folio_user_zero() or whatever it is.
> >>
> >> page_alloc.c calling folio_user_zero() is fine, but my question is
> >> whether we should do the zeroing inside post_alloc_hook(), part of
> >> allocation.
> >>
> >> What I propose is to lift __GFP_ZERO up as much as possible,
> >> so that most of allocation code does not need to care about it.
> >> We do the zeroing right before the page is returned to callers.
> >>
> >
> > essentially we end up with something like
> >
> > alloc_frozen_...(..., gfp)
> > {
> > folio = whatever(..., gfp);
> > if (gfp & __GFP_ZERO)
> > folio_zero(folio, -1); /* don't do cache flush part */
> > }
> >
> > alloc_frozen_user_...(..., gfp, user_addr)
> > {
> > folio = whatever(..., gfp);
> > if (gfp & __GFP_ZERO)
> > folio_zero(folio, user_addr); /* do cache flush part */
> > }
> >
> > The downside of this is obvious: it's easy for developers to get this
> > wrong and call the non-user interface for user-bound allocations and
> > miss the cache flush (that is only needed on some archs).
> >
> > Not saying that's a deal breaker, but it's something to chew on.
>
> I agree that misuse can cause trouble. But if we do the churn approach,
> what prevents developer from doing alloc_frozen(..., user_addr = -1)
> and using the returned page for userspace? It is possible the allocated
> page can be exported to userspace later.
>
> BTW, that cache flush thing is fragile even today,
Probably arch dependent. On arm32, I think if you miss the flush, then
PG_dcache_clean will be clear and then you get a perf hit but
it's still correct. Didn't check others.
> you probably can
> do alloc_page() + vm_insert() to get a page without doing proper flush
> and export it to userspace. There seems to be no mechanism to
> prevent that.
>
> Best Regards,
> Yan, Zi
Because maybe you want to expose data to userspace?
--
MST