Eric Anholt wrote:
> >> c) User-space bo-caching and reuse.
> >> d) User-space buffer pools.
> >>
> >> TG is heading down the d) path since it also fixes the texture
> >> granularity problem.
> >>
> There's no texture granularity problem on Intel, right, given that we
> have a fixed mipmap layout? Or have you seen real applications with
> piles of very small textures to the point where it's an issue? I'm
> concerned about our code growing in complexity to handle issues that
> aren't actually issues for our hardware.

Yes, we've seen such cases, but they've been worked around in the app.

> c) is a very attractive solution as it's a 100-line diff to what we have
> today (people.freedesktop.org:~anholt/drm intel-buffer-reuse). Looks
> like something broke the 20% performance win I saw when I last tested
> this a few weeks ago.
>
> But I'd rather just fix the performance bugs we have in the current
> DRM implementation so that we potentially don't even need to bother with
> these hacks. So, !highmem, page-by-page allocation.
>
> >> 2) Relocation application. KeithP's presumed_offset stuff has to a great
> >> extent fixed this problem. I think the kmap_atomic_prot_pfn() stuff just
> >> added will take care of the rest, and I hope the mm kernel guys will
> >> understand the problem and accept kmap_atomic_prot_pfn(). I'm
> >> working on a patch that will do post-validation-only relocations this way.
>
> Right now, kernel relocation handling is up to 9% of the profile. Unless
> some really weird inlining is going on, the map calls aren't showing up,
> which isn't surprising since we never actually do any mapping today (all
> of our memory is direct-mapped lowmem. I really want to fix that).

Yes, you're probably always using the fixed kernel map.
kmap_atomic_prot_pfn() is important for post relocations that are done
through the GTT, where an ioremap() otherwise would be nasty.
> Going from CACHED_MAPPED back to uncached even with buffer reuse is a
> 22% performance drop (but at least it's not the 79% hit to the face you
> get without buffer reuse).

Hmm, this sounds odd. That would mean you must still be doing a lot of
buffer binding / unbinding? Relocations using the master drm will also be
slow, of course.

> So, I want DRM_BO_FLAG_CACHED_MAPPED to be the default for our DRM when
> you don't ask for cached buffers, determined by the driver flagging it
> as doable on this chipset.

I see two problems with this: One is cosmetic, in that
DRM_BO_FLAG_CACHED_MAPPED doesn't have the same semantics as
!DRM_BO_FLAG_CACHED, so you can't use it for user-space buffer pools. The
other is that TG will be needing the functionality of !DRM_BO_FLAG_CACHED,
so we need to provide a way to keep that available. I guess you will be
needing it as well for things like scanout buffers?
/Thomas

-------------------------------------------------------------------------
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel