Re: [Mesa-dev] [PATCH 3/3] i965/gen6-7/sol: Bump primitive counter BO size.
Eero Tamminen writes:

> Hi,
>
> Tested-By: Eero Tamminen
>
> On 18.11.2017 00:28, Francisco Jerez wrote:
>> Improves performance of SynMark2 OglGSCloth by a further 9.65%±0.59%
>> due to the reduction in overwraps of the primitive count buffer that
>> lead to a CPU stall on previous rendering.  Cumulative performance
>> improvement from the series 81.50% ±0.96% (data gathered on VLV).
>
> I tested the patch series with transform feedback using tests on SNB
> GT2, BYT, HSW GT2 and BSW, using git versions of Mesa, drm-tip kernel
> and X server.
>
>
> SNB GT2:
> * No noticeable perf impact on GfxBench Manhattan
> * Mesa unfortunately renders GSCloth incorrectly on SNB,
>   but that happens also without this patch series:
>   https://bugs.freedesktop.org/show_bug.cgi?id=103824
>
> BYT:
> * 1-2% perf improvement in GfxBench Manhattan 3.0 & 3.1
> * 30% perf improvement in GSCloth
>   - Device is single channel one, was your VLV 2-channel one?
>

I don't have access to the VLV system today to verify, but your system is
likely hitting the bandwidth limits of the system sooner than mine
(either because of slower memory clocks or because of single- vs
dual-channel), after which point performance doesn't improve further for
you because it's fully bandwidth-bound.

> HSW GT2:
> * No noticeable perf impact
>

This is also expected: HSW uses the hsw_sol.c XFB implementation, which
this patch doesn't have any effect on.

> BSW:
> * No noticeable perf impact (as expected)
>
>
>       - Eero
>
>> ---
>>  src/mesa/drivers/dri/i965/gen6_sol.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
>> index b1baf01bcd9..355acd42189 100644
>> --- a/src/mesa/drivers/dri/i965/gen6_sol.c
>> +++ b/src/mesa/drivers/dri/i965/gen6_sol.c
>> @@ -197,7 +197,7 @@ brw_new_transform_feedback(struct gl_context *ctx, GLuint name)
>>     brw_obj->offset_bo =
>>        brw_bo_alloc(brw->bufmgr, "transform feedback offsets", 16, 64);
>>     brw_obj->prim_count_bo =
>> -      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 4096, 64);
>> +      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 16384, 64);
>>
>>     return &brw_obj->base;
>>  }
>> @@ -287,7 +287,8 @@ brw_save_primitives_written_counters(struct brw_context *brw,
>>     assert(obj->prim_count_bo != NULL);
>>
>>     /* Check if there's enough space for a new pair of four values. */
>> -   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >= 4096) {
>> +   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >=
>> +       obj->prim_count_bo->size) {
>>        aggregate_transform_feedback_counter(brw, obj->prim_count_bo,
>>                                             &obj->previous_counter);
>>        aggregate_transform_feedback_counter(brw, obj->prim_count_bo,
>>

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] i965/gen6-7/sol: Bump primitive counter BO size.
Hi,

Tested-By: Eero Tamminen

On 18.11.2017 00:28, Francisco Jerez wrote:
> Improves performance of SynMark2 OglGSCloth by a further 9.65%±0.59%
> due to the reduction in overwraps of the primitive count buffer that
> lead to a CPU stall on previous rendering.  Cumulative performance
> improvement from the series 81.50% ±0.96% (data gathered on VLV).

I tested the patch series with transform feedback using tests on SNB
GT2, BYT, HSW GT2 and BSW, using git versions of Mesa, drm-tip kernel
and X server.


SNB GT2:
* No noticeable perf impact on GfxBench Manhattan
* Mesa unfortunately renders GSCloth incorrectly on SNB,
  but that happens also without this patch series:
  https://bugs.freedesktop.org/show_bug.cgi?id=103824

BYT:
* 1-2% perf improvement in GfxBench Manhattan 3.0 & 3.1
* 30% perf improvement in GSCloth
  - Device is single channel one, was your VLV 2-channel one?

HSW GT2:
* No noticeable perf impact

BSW:
* No noticeable perf impact (as expected)


      - Eero

> ---
>  src/mesa/drivers/dri/i965/gen6_sol.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
> index b1baf01bcd9..355acd42189 100644
> --- a/src/mesa/drivers/dri/i965/gen6_sol.c
> +++ b/src/mesa/drivers/dri/i965/gen6_sol.c
> @@ -197,7 +197,7 @@ brw_new_transform_feedback(struct gl_context *ctx, GLuint name)
>     brw_obj->offset_bo =
>        brw_bo_alloc(brw->bufmgr, "transform feedback offsets", 16, 64);
>     brw_obj->prim_count_bo =
> -      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 4096, 64);
> +      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 16384, 64);
>
>     return &brw_obj->base;
>  }
> @@ -287,7 +287,8 @@ brw_save_primitives_written_counters(struct brw_context *brw,
>     assert(obj->prim_count_bo != NULL);
>
>     /* Check if there's enough space for a new pair of four values. */
> -   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >= 4096) {
> +   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >=
> +       obj->prim_count_bo->size) {
>        aggregate_transform_feedback_counter(brw, obj->prim_count_bo,
>                                             &obj->previous_counter);
>        aggregate_transform_feedback_counter(brw, obj->prim_count_bo,
[Mesa-dev] [PATCH 3/3] i965/gen6-7/sol: Bump primitive counter BO size.
Improves performance of SynMark2 OglGSCloth by a further 9.65%±0.59%
due to the reduction in overwraps of the primitive count buffer that
lead to a CPU stall on previous rendering.  Cumulative performance
improvement from the series 81.50% ±0.96% (data gathered on VLV).
---
 src/mesa/drivers/dri/i965/gen6_sol.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index b1baf01bcd9..355acd42189 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -197,7 +197,7 @@ brw_new_transform_feedback(struct gl_context *ctx, GLuint name)
    brw_obj->offset_bo =
       brw_bo_alloc(brw->bufmgr, "transform feedback offsets", 16, 64);
    brw_obj->prim_count_bo =
-      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 4096, 64);
+      brw_bo_alloc(brw->bufmgr, "xfb primitive counts", 16384, 64);

    return &brw_obj->base;
 }
@@ -287,7 +287,8 @@ brw_save_primitives_written_counters(struct brw_context *brw,
    assert(obj->prim_count_bo != NULL);

    /* Check if there's enough space for a new pair of four values. */
-   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >= 4096) {
+   if ((obj->counter.bo_end + 2) * streams * sizeof(uint64_t) >=
+       obj->prim_count_bo->size) {
       aggregate_transform_feedback_counter(brw, obj->prim_count_bo,
                                            &obj->previous_counter);
       aggregate_transform_feedback_counter(brw, obj->prim_count_bo,
--
2.14.2
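For readers following the thread, the effect of the size bump on the capacity check in the second hunk can be worked through with a small standalone sketch. It is not driver code: the two helper names are hypothetical, the stream count of 4 used in the note below is an assumption, and so is the idea that bo_end starts at 0 and advances by one per saved snapshot. Only the comparison itself is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same comparison as the patched check: true once the buffer has no room
 * for another pair of snapshots, i.e. the counters must be aggregated. */
static bool
prim_count_bo_is_full(unsigned bo_end, unsigned streams, size_t bo_size)
{
   return (size_t)(bo_end + 2) * streams * sizeof(uint64_t) >= bo_size;
}

/* Hypothetical helper: number of snapshots that can be stored before the
 * check above fires, assuming bo_end starts at 0 and grows by one per
 * snapshot (an assumption of this sketch, not taken from the patch). */
static unsigned
snapshots_before_aggregation(unsigned streams, size_t bo_size)
{
   assert(streams > 0);

   unsigned bo_end = 0;
   while (!prim_count_bo_is_full(bo_end, streams, bo_size))
      bo_end++;
   return bo_end;
}
```

Under those assumptions, with 4 streams each snapshot takes 32 bytes, so snapshots_before_aggregation(4, 4096) gives 126 and snapshots_before_aggregation(4, 16384) gives 510. That is roughly four times as many snapshots between aggregations, which is consistent with the reduction in overwraps described in the commit message.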