On 17/04/17 07:58 AM, Marek Olšák wrote:
> On Fri, Apr 14, 2017 at 12:14 PM, Michel Dänzer <mic...@daenzer.net> wrote:
>> On 04/04/17 05:11 AM, Marek Olšák wrote:
>>> On Fri, Mar 31, 2017 at 5:24 AM, Michel Dänzer <mic...@daenzer.net> wrote:
>>>> On 30/03/17 07:03 PM, Michel Dänzer wrote:
>>>>> On 25/03/17 01:33 AM, Marek Olšák wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm sharing this idea here, because it's something that has been
>>>>>> decreasing our performance a lot recently, for example:
>>>>>> http://openbenchmarking.org/prospect/1703011-RI-RADEONDIR06/7b7668cfc109d1c3dc27e871c8aea71ca13f23fa
>>>>>
>>>>> The attached proof-of-concept patch (on top of Christian's "CPU mapping
>>>>> of split VRAM buffers" series, ported from radeon) results in 145.05 fps
>>>>> on my Tonga.
>>>>
>>>> I get the same result without my or Christian's patches though, with
>>>> 4.11 based DRM or amd-staging-4.9. So I guess I just can't reproduce the
>>>> problem with this test. Are there any other tests for it?
>>>
>>> It's random. Sometimes the benchmark runs OK, other times it's slow.
>>> You can easily see the difference by observing how smooth it is. The
>>> visible VRAM evictions result in constant 100-200ms stalls but not
>>> every frame, which feels like the frame rate is much lower than it
>>> actually is.
>>>
>>> Make sure your graphics details are maxed out. The best score I can
>>> get with my rig is 70 fps. (Fiji & Core i5 3570)
>>
>> I'm getting around 53-54 fps at Ultra with Tonga, both with Mesa 13.0.6
>> and Git.
>>
>> Have you checked whether Christian's patches for CPU access to split VRAM
>> buffers help? I can imagine that forcing contiguous VRAM buffers for CPU
>> access could cause lots of other BOs to be unnecessarily evicted from
>> VRAM, if at least one of their fragments happens to be in the CPU
>> visible part of VRAM.
> 
> I've finally tested latest amd-staging-4.9 and I'm very pleased. For
> the first time, the Deus Ex benchmark has almost no hiccups. I've
> never seen it so smooth. At one point, the BO move rate increased
> to 200 MB/s, stayed there for a couple of seconds, and then it dropped
> to 0 again. The frame rate was OK-ish, so I guess the moves didn't
> happen all at once. I also tested DiRT Rally and I haven't been able
> to reproduce the low FPS with the consistently-high BO move rate that
> I saw several months ago.
> 
> We could do some move throttling there for sure, but it's much better
> than it ever was.

That's great to hear. If you get a chance, it would be interesting to know
whether the attached updated patch improves things even more for you. (The
patch I attached previously couldn't work as intended; this one at least
might. :)
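
For anyone who wants to follow the decision the patch makes without
reading the TTM plumbing, here is a rough standalone sketch of it. This is
not the kernel code: struct vram_node, bo_entirely_invisible() and
choose_evict_target() are hypothetical names made up for illustration, and
the sketch assumes each fragment has a non-zero size and that the
fragments together cover the BO's pages. The idea is that a BO which
already sits entirely in the CPU-inaccessible part of VRAM goes straight
to GTT, while anything touching visible VRAM is first tried in invisible
VRAM, with GTT as the only busy placement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one contiguous VRAM fragment of a BO. */
struct vram_node {
	unsigned long start;	/* first page of the fragment */
	unsigned long size;	/* length in pages, assumed non-zero */
};

enum evict_target {
	EVICT_TO_GTT,				/* move straight to GTT */
	EVICT_TO_INVISIBLE_VRAM_THEN_GTT,	/* prefer invisible VRAM, GTT if busy */
};

/*
 * Return true if every fragment of the BO starts at or above fpfn,
 * the first page that the CPU cannot access.
 */
static bool bo_entirely_invisible(const struct vram_node *nodes,
				  unsigned long num_pages,
				  unsigned long fpfn)
{
	unsigned long pages_left = num_pages;

	assert(nodes != NULL);

	while (pages_left) {
		if (nodes->start < fpfn)
			return false;	/* this fragment is CPU visible */
		if (nodes->size >= pages_left)
			break;		/* last fragment reached */
		pages_left -= nodes->size;
		nodes++;
	}
	return true;
}

/* Pick where an evicted VRAM BO should go (simplified model). */
static enum evict_target
choose_evict_target(const struct vram_node *nodes, unsigned long num_pages,
		    unsigned long visible_pages, unsigned long total_pages)
{
	/* All of VRAM is CPU visible: there is no invisible part to prefer. */
	if (visible_pages >= total_pages)
		return EVICT_TO_GTT;

	/* Already entirely in invisible VRAM: moving within it gains nothing. */
	if (bo_entirely_invisible(nodes, num_pages, visible_pages))
		return EVICT_TO_GTT;

	return EVICT_TO_INVISIBLE_VRAM_THEN_GTT;
}
```

The sketch leaves out the buffer_funcs_ring readiness check and the actual
placement setup, so it only models which branch gets taken.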


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 7f9710502bcc..78362e09cc51 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -205,7 +205,44 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo,
 	case TTM_PL_VRAM:
 		if (adev->mman.buffer_funcs_ring->ready == false) {
 			amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_CPU);
+		} else if (adev->mc.visible_vram_size < adev->mc.real_vram_size) {
+			unsigned fpfn = adev->mc.visible_vram_size >> PAGE_SHIFT;
+			int i;
+
+			if (bo->mem.start >= fpfn) {
+				struct drm_mm_node *node = bo->mem.mm_node;
+				unsigned long pages_left;
+
+				for (pages_left = bo->mem.num_pages; pages_left;
+				     pages_left -= node->size, node++) {
+					if (node->start < fpfn)
+						break;
+				}
+
+				if (!pages_left)
+					goto gtt;
+			}
+
+			/* Try evicting to the CPU inaccessible part of VRAM
+			 * first, but only set GTT as busy placement, so this
+			 * BO will be evicted to GTT rather than causing other
+			 * BOs to be evicted from VRAM
+			 */
+			amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM |
+							 AMDGPU_GEM_DOMAIN_GTT);
+			abo->placement.num_busy_placement = 0;
+			for (i = 0; i < abo->placement.num_placement; i++) {
+				if (abo->placements[i].flags & TTM_PL_FLAG_VRAM) {
+					if (abo->placements[i].fpfn < fpfn)
+						abo->placements[i].fpfn = fpfn;
+				} else {
+					abo->placement.busy_placement =
+						&abo->placements[i];
+					abo->placement.num_busy_placement = 1;
+				}
+			}
 		} else {
+gtt:
 			amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
 			for (i = 0; i < abo->placement.num_placement; ++i) {
 				if (!(abo->placements[i].flags &
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
