On 17/04/17 07:58 AM, Marek Olšák wrote:
> On Fri, Apr 14, 2017 at 12:14 PM, Michel Dänzer <mic...@daenzer.net> wrote:
>> On 04/04/17 05:11 AM, Marek Olšák wrote:
>>> On Fri, Mar 31, 2017 at 5:24 AM, Michel Dänzer <mic...@daenzer.net> wrote:
>>>> On 30/03/17 07:03 PM, Michel Dänzer wrote:
>>>>> On 25/03/17 01:33 AM, Marek Olšák wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm sharing this idea here, because it's something that has been
>>>>>> decreasing our performance a lot recently, for example:
>>>>>> http://openbenchmarking.org/prospect/1703011-RI-RADEONDIR06/7b7668cfc109d1c3dc27e871c8aea71ca13f23fa
>>>>>
>>>>> The attached proof-of-concept patch (on top of Christian's "CPU mapping
>>>>> of split VRAM buffers" series, ported from radeon) results in 145.05 fps
>>>>> on my Tonga.
>>>>
>>>> I get the same result without my or Christian's patches though, with
>>>> 4.11 based DRM or amd-staging-4.9. So I guess I just can't reproduce the
>>>> problem with this test. Are there any other tests for it?
>>>
>>> It's random. Sometimes the benchmark runs OK, other times it's slow.
>>> You can easily see the difference but observing how smooth it is. The
>>> visible VRAM evictions result in constant 100-200ms stalls but not
>>> every frame, which feels like the frame rate is much lower than it
>>> actually is.
>>>
>>> Make sure your graphics details are maxed out. The best score I can
>>> get with my rig is 70 fps. (Fiji & Core i5 3570)
>>
>> I'm getting around 53-54 fps at Ultra with Tonga, both with Mesa 13.0.6
>> and Git.
>>
>> Have you tried if Christian's patches for CPU access to split VRAM
>> buffers help? I can imagine that forcing contiguous VRAM buffers for CPU
>> access could cause lots of other BOs to be unnecessarily evicted from
>> VRAM, if at least one of their fragments happens to be in the CPU
>> visible part of VRAM.
>
> I've finally tested latest amd-staging-4.9 and I'm very pleased. For
> the first time, the Deus Ex benchmark has almost no hiccups. I've
> never seen it so smooth. At one point, the MB/s BO move rate increase
> to 200MB/s, stayed there for a couple of seconds, and then it dropped
> to 0 again. The frame rate was OK-ish, so I guess the moves didn't
> happen all at once. I also tested DiRT Rally and I haven't been able
> to reproduce the low FPS with the consistently-high BO move rate that
> I saw several months ago.
>
> We could do some move throttling there for sure, but it's much better
> than it ever was.

That's great to hear. If you get a chance, it would be interesting to
see whether the attached updated patch improves things even more for
you. (The patch I attached previously couldn't work as intended; this
one at least might :)


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 7f9710502bcc..78362e09cc51 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -205,7 +205,44 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo,
 	case TTM_PL_VRAM:
 		if (adev->mman.buffer_funcs_ring->ready == false) {
 			amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_CPU);
+		} else if (adev->mc.visible_vram_size < adev->mc.real_vram_size) {
+			unsigned fpfn = adev->mc.visible_vram_size >> PAGE_SHIFT;
+			int i;
+
+			if (bo->mem.start >= fpfn) {
+				struct drm_mm_node *node = bo->mem.mm_node;
+				unsigned long pages_left;
+
+				for (pages_left = bo->mem.num_pages; pages_left;
+				     pages_left -= node->size, node++) {
+					if (node->start < fpfn)
+						break;
+				}
+
+				if (!pages_left)
+					goto gtt;
+			}
+
+			/* Try evicting to the CPU inaccessible part of VRAM
+			 * first, but only set GTT as busy placement, so this
+			 * BO will be evicted to GTT rather than causing other
+			 * BOs to be evicted from VRAM
+			 */
+			amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM |
+							 AMDGPU_GEM_DOMAIN_GTT);
+			abo->placement.num_busy_placement = 0;
+			for (i = 0; i < abo->placement.num_placement; i++) {
+				if (abo->placements[i].flags & TTM_PL_FLAG_VRAM) {
+					if (abo->placements[i].fpfn < fpfn)
+						abo->placements[i].fpfn = fpfn;
+				} else {
+					abo->placement.busy_placement =
+						&abo->placements[i];
+					abo->placement.num_busy_placement = 1;
+				}
+			}
 		} else {
+gtt:
 			amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
 			for (i = 0; i < abo->placement.num_placement; ++i) {
 				if (!(abo->placements[i].flags &
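
For anyone who wants to play with the node walk from the hunk above outside the kernel, here is a minimal userspace sketch of just that check. It is not part of the patch: struct vram_node and bo_outside_visible_vram() are made-up names standing in for struct drm_mm_node and the open-coded loop, and the sketch assumes the node sizes add up exactly to num_pages. A true result corresponds to the "goto gtt" case (the BO already has no page in CPU visible VRAM); false corresponds to first trying the CPU inaccessible part of VRAM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct drm_mm_node: one contiguous VRAM
 * range, with start and size given in pages.
 */
struct vram_node {
	unsigned long start;
	unsigned long size;
};

/* Walk the nodes backing a BO, the same way the loop in the patch does.
 *
 * Returns true if every node starts at or above fpfn (the first page
 * frame past CPU visible VRAM), false as soon as one node starts
 * below it.
 *
 * Assumption of this sketch: the node sizes sum to exactly num_pages.
 */
static bool bo_outside_visible_vram(const struct vram_node *node,
				    unsigned long num_pages,
				    unsigned long fpfn)
{
	unsigned long pages_left;

	for (pages_left = num_pages; pages_left;
	     pages_left -= node->size, node++) {
		/* Guard against underflow if the assumption is violated */
		assert(node->size <= pages_left);

		if (node->start < fpfn)
			return false;
	}

	return true;
}
```

With 256 MiB of CPU visible VRAM and 4 KiB pages, fpfn is 65536. A BO split into nodes starting at pages 70000 and 80000 would then go straight to GTT, while a BO with one fragment at page 1000 would be tried in the CPU inaccessible part of VRAM first.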
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx