On Fri, Dec 19, 2025 at 7:19 PM Maxime Ripard <[email protected]> wrote:
>
> Hi,
>
> On Tue, Dec 16, 2025 at 11:06:59AM +0900, T.J. Mercier wrote:
> > On Mon, Dec 15, 2025 at 7:51 PM Maxime Ripard <[email protected]> wrote:
> > > On Fri, Dec 12, 2025 at 08:25:19AM +0900, T.J. Mercier wrote:
> > > > On Fri, Dec 12, 2025 at 4:31 AM Eric Chanudet <[email protected]>
> > > > wrote:
> > > > >
> > > > > The system dma-buf heap lets userspace allocate buffers from the page
> > > > > allocator. However, these allocations are not accounted for in memcg,
> > > > > allowing processes to escape limits that may be configured.
> > > > >
> > > > > Pass the __GFP_ACCOUNT for our allocations to account them into memcg.
> > > >
> > > > We had a discussion just last night in the MM track at LPC about how
> > > > shared memory accounted in memcg is pretty broken. Without a way to
> > > > identify (and possibly transfer) ownership of a shared buffer, this
> > > > makes the accounting of shared memory, and zombie memcg problems
> > > > worse. :\
> > >
> > > Are there notes or a report from that discussion anywhere?
> >
> > The LPC vids haven't been clipped yet, and actually I can't even find
> > the recorded full live stream from Hall A2 on the first day. So I
> > don't think there's anything to look at, but I bet there's probably
> > nothing there you don't already know.
>
> Ack, thanks for looking at it still :)
>
> > > The way I see it, the dma-buf heaps *trivial* case is non-existent at
> > > the moment and that's definitely broken. Any application can bypass its
> > > cgroups limits trivially, and that's a pretty big hole in the system.
> >
> > Agree, but if we only charge the first allocator then limits can still
> > easily be bypassed assuming an app can cause an allocation outside of
> > its cgroup tree.
> >
> > I'm not sure using static memcg limits where a significant portion of
> > the memory can be shared is really feasible. Even with just pagecache
> > being charged to memcgs, we're having trouble defining a static memcg
> > limit that is really useful since it has to be high enough to
> > accomodate occasional spikes due to shared memory that might or might
> > not be charged (since it can only be charged to one memcg - it may be
> > spread around or it may all get charged to one memcg). So excessive
> > anonymous use has to get really bad before it gets punished.
> >
> > What I've been hearing lately is that folks are polling memory.stat or
> > PSI or other metrics and using that to take actions (memory.reclaim /
> > killing / adjust memory.high) at runtime rather than relying on
> > memory.high/max behavior with a static limit.
>
> But that's only side effects of a buffer being shared, right? (which,
> for a buffer sharing mechanism is still pretty important, but still)
>
> > > The shared ownership is indeed broken, but it's not more or less broken
> > > than, say, memfd + udmabuf, and I'm sure plenty of others.
> >
> > One thing that's worse about system heap buffers is that unlike memfd
> > the memory isn't reclaimable. So without killing all users there's
> > currently no way to deal with the zombie issue. Harry's proposing
> > reparenting, but I don't think our current interfaces support that
> > because we'd have to mess with the page structs behind system heap
> > dmabufs to change the memcg during reparenting.
> >
> > Ah... but udmabuf pins the memfd pages, so you're right that memfd +
> > udmabuf isn't worse.
> >
> > > So we really improve the common case, but only make the "advanced"
> > > slightly more broken than it already is.
> > >
> > > Would you disagree?
> >
> > I think memcg limits in this case just wouldn't be usable because of
> > what I mentioned above. In our common case the allocator is in a
> > different cgroup tree than the real users of the buffer.
>
> So, my issue with this is that we want to fix not only dma-buf itself,
> but every device buffer allocation mechanism, so also v4l2, drm, etc.
>
> So we'll need a lot of infrastructure and rework outside of dma-buf to
> get there, and figuring out how to solve the shared buffer accounting is
> indeed one of them, but was so far considered kind the thing to do last
> last time we discussed.
>
> What I get from that discussion is that we now consider it a
> prerequisite, and given how that topic has been advancing so far, one
> that would take a couple of years at best to materialize into something
> useful and upstream.
>
> Thus, it blocks all the work around it for years.
>
> Would you be open to merging patches that work on it but only enabled
> through a kernel parameter for example (and possibly taint the kernel?)?
> That would allow to work towards that goal while not being blocked by
> the shared buffer accounting, and not affecting the general case either.
>
> Maxime

Hi Maxime,

A kernel param or a CONFIG sounds like a good compromise to allow work to
progress. I'd be happy to add my R-B to that.
