Re: [Mesa-dev] [PATCH 1/2] i965: Viewport extents on gen8
On Thu, Jul 03, 2014 at 11:23:13AM -0700, Ben Widawsky wrote: Viewport extents are a 3rd rectangle that defines which pixels get discarded as part of the rasterization process. This can potentially improve performance by reducing cache usage, and freeing up PS cycles. This will get hit if one's viewport is smaller than the full renderbuffer, and scissoring is not used. The functionality itself very much resembles scissoring. --- src/mesa/drivers/dri/i965/gen8_viewport_state.c | 24 +++- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen8_viewport_state.c b/src/mesa/drivers/dri/i965/gen8_viewport_state.c index b366246..2bf5fbb 100644 --- a/src/mesa/drivers/dri/i965/gen8_viewport_state.c +++ b/src/mesa/drivers/dri/i965/gen8_viewport_state.c @@ -86,17 +86,23 @@ gen8_upload_sf_clip_viewport(struct brw_context *brw) vp[10] = -gby; /* y-min */ vp[11] = gby; /* y-max */ - /* _NEW_SCISSOR | _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport */ + /* _NEW_SCISSOR | _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport + * The hardware will take the intersection of the drawing rectangle, + * scissor rectangle, and the viewport extents. We don't need to be + * smart, and can therefore just program the viewport extents. + */ + float viewport_Xmax = ctx-ViewportArray[i].X + ctx-ViewportArray[i].Width; + float viewport_Ymax = ctx-ViewportArray[i].Y + ctx-ViewportArray[i].Height; These could be both declared as constants and the right hand sides should go to their own lines as they are now overflowing the 80-char limit. Otherwise this looks to me as the right thing to do, and not only from performance point of view. I'm thinking a case where the limits for the drawbuffer are not set but where the viewport limits are - with the current logic we wouldn't apply the limits, right? I guess we don't have any such piglit test case. But then I also realized that if we applied this, then gen8 wouldn't take the drawbuffer limits into account anywhere else. So this should break some piglit tests? I tried to look at the earlier generations from six onwards and actually couldn't find any state atom using the drawbuffer limits. (The only reference is SF-scissor setting in brw_sf_state.c:upload_sf_vp() which is for gen 6). I guess I can say I'm confused. if (render_to_fbo) { - vp[12] = ctx-DrawBuffer-_Xmin; - vp[13] = ctx-DrawBuffer-_Xmax - 1; - vp[14] = ctx-DrawBuffer-_Ymin; - vp[15] = ctx-DrawBuffer-_Ymax - 1; + vp[12] = ctx-ViewportArray[i].X; + vp[13] = viewport_Xmax - 1; + vp[14] = ctx-ViewportArray[i].Y; + vp[15] = viewport_Ymax - 1; } else { - vp[12] = ctx-DrawBuffer-_Xmin; - vp[13] = ctx-DrawBuffer-_Xmax - 1; - vp[14] = ctx-DrawBuffer-Height - ctx-DrawBuffer-_Ymax; - vp[15] = ctx-DrawBuffer-Height - ctx-DrawBuffer-_Ymin - 1; + vp[12] = ctx-ViewportArray[i].X; + vp[13] = viewport_Xmax - 1; + vp[14] = ctx-DrawBuffer-Height - viewport_Ymax; + vp[15] = ctx-DrawBuffer-Height - ctx-ViewportArray[i].Y - 1; } vp += 16; -- 2.0.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Viewport extents on gen8
On Thu, Jul 24, 2014 at 10:29:11AM +0300, Pohjolainen, Topi wrote: On Thu, Jul 03, 2014 at 11:23:13AM -0700, Ben Widawsky wrote: Viewport extents are a 3rd rectangle that defines which pixels get discarded as part of the rasterization process. This can potentially improve performance by reducing cache usage, and freeing up PS cycles. This will get hit if one's viewport is smaller than the full renderbuffer, and scissoring is not used. The functionality itself very much resembles scissoring. --- src/mesa/drivers/dri/i965/gen8_viewport_state.c | 24 +++- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen8_viewport_state.c b/src/mesa/drivers/dri/i965/gen8_viewport_state.c index b366246..2bf5fbb 100644 --- a/src/mesa/drivers/dri/i965/gen8_viewport_state.c +++ b/src/mesa/drivers/dri/i965/gen8_viewport_state.c @@ -86,17 +86,23 @@ gen8_upload_sf_clip_viewport(struct brw_context *brw) vp[10] = -gby; /* y-min */ vp[11] = gby; /* y-max */ - /* _NEW_SCISSOR | _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport */ + /* _NEW_SCISSOR | _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport + * The hardware will take the intersection of the drawing rectangle, + * scissor rectangle, and the viewport extents. We don't need to be + * smart, and can therefore just program the viewport extents. + */ + float viewport_Xmax = ctx-ViewportArray[i].X + ctx-ViewportArray[i].Width; + float viewport_Ymax = ctx-ViewportArray[i].Y + ctx-ViewportArray[i].Height; These could be both declared as constants and the right hand sides should go to their own lines as they are now overflowing the 80-char limit. Got it. Otherwise this looks to me as the right thing to do, and not only from performance point of view. I'm thinking a case where the limits for the drawbuffer are not set but where the viewport limits are - with the current logic we wouldn't apply the limits, right? I guess we don't have any such piglit test case. I'm new here. I don't understand. Can you explain what you mean by drawbuffer limits not being set set? I wasn't really aware such a thing was possible (if I followed your meaning). But then I also realized that if we applied this, then gen8 wouldn't take the drawbuffer limits into account anywhere else. So this should break some piglit tests? I didn't run full, but quick didn't have any regressions. I tried to look at the earlier generations from six onwards and actually couldn't find any state atom using the drawbuffer limits. (The only reference is SF-scissor setting in brw_sf_state.c:upload_sf_vp() which is for gen 6). I guess I can say I'm confused. if (render_to_fbo) { - vp[12] = ctx-DrawBuffer-_Xmin; - vp[13] = ctx-DrawBuffer-_Xmax - 1; - vp[14] = ctx-DrawBuffer-_Ymin; - vp[15] = ctx-DrawBuffer-_Ymax - 1; + vp[12] = ctx-ViewportArray[i].X; + vp[13] = viewport_Xmax - 1; + vp[14] = ctx-ViewportArray[i].Y; + vp[15] = viewport_Ymax - 1; } else { - vp[12] = ctx-DrawBuffer-_Xmin; - vp[13] = ctx-DrawBuffer-_Xmax - 1; - vp[14] = ctx-DrawBuffer-Height - ctx-DrawBuffer-_Ymax; - vp[15] = ctx-DrawBuffer-Height - ctx-DrawBuffer-_Ymin - 1; + vp[12] = ctx-ViewportArray[i].X; + vp[13] = viewport_Xmax - 1; + vp[14] = ctx-DrawBuffer-Height - viewport_Ymax; + vp[15] = ctx-DrawBuffer-Height - ctx-ViewportArray[i].Y - 1; } vp += 16; I'll take a look, but Ken may have a more immediate answer. I was distracted by other things for the 3 weeks, but I am back looking at this now (and the GB clipping as well). My quick answer is that hardware will just do the right thing (I only looked at GEN8), but I need to read the docs further. I've yet to find a perf win, though some benchmarks do hit this path. I'll do some more piglit testing as well, and update with that info. -- Ben Widawsky, Intel Open Source Technology Center ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Viewport extents on gen8
On Thu, Jul 24, 2014 at 10:22:25AM -0700, Ben Widawsky wrote: On Thu, Jul 24, 2014 at 10:29:11AM +0300, Pohjolainen, Topi wrote: On Thu, Jul 03, 2014 at 11:23:13AM -0700, Ben Widawsky wrote: Viewport extents are a 3rd rectangle that defines which pixels get discarded as part of the rasterization process. This can potentially improve performance by reducing cache usage, and freeing up PS cycles. This will get hit if one's viewport is smaller than the full renderbuffer, and scissoring is not used. The functionality itself very much resembles scissoring. --- src/mesa/drivers/dri/i965/gen8_viewport_state.c | 24 +++- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen8_viewport_state.c b/src/mesa/drivers/dri/i965/gen8_viewport_state.c index b366246..2bf5fbb 100644 --- a/src/mesa/drivers/dri/i965/gen8_viewport_state.c +++ b/src/mesa/drivers/dri/i965/gen8_viewport_state.c @@ -86,17 +86,23 @@ gen8_upload_sf_clip_viewport(struct brw_context *brw) vp[10] = -gby; /* y-min */ vp[11] = gby; /* y-max */ - /* _NEW_SCISSOR | _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport */ + /* _NEW_SCISSOR | _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport + * The hardware will take the intersection of the drawing rectangle, + * scissor rectangle, and the viewport extents. We don't need to be + * smart, and can therefore just program the viewport extents. + */ + float viewport_Xmax = ctx-ViewportArray[i].X + ctx-ViewportArray[i].Width; + float viewport_Ymax = ctx-ViewportArray[i].Y + ctx-ViewportArray[i].Height; These could be both declared as constants and the right hand sides should go to their own lines as they are now overflowing the 80-char limit. Got it. Otherwise this looks to me as the right thing to do, and not only from performance point of view. I'm thinking a case where the limits for the drawbuffer are not set but where the viewport limits are - with the current logic we wouldn't apply the limits, right? I guess we don't have any such piglit test case. I'm new here. I don't understand. Can you explain what you mean by drawbuffer limits not being set set? I wasn't really aware such a thing was possible (if I followed your meaning). I'm asking for the same reason, I don't know how these things are set in the core. I meant the DrawBuffer::_Xmin/Xmax/Ymin/Ymax. So I'm thinking a case where DrawBuffer::_Xmin/Xmax/Ymin/Ymax do not impose any limits but where ViewportArray[] do. Current logic wouldn't take them into account but your version would. But then on the other hand after your change I don't see how the constraints in DrawBuffer::_Xmin/Xmax/Ymin/Ymax would be taken into account. But then I also realized that if we applied this, then gen8 wouldn't take the drawbuffer limits into account anywhere else. So this should break some piglit tests? I didn't run full, but quick didn't have any regressions. I tried to look at the earlier generations from six onwards and actually couldn't find any state atom using the drawbuffer limits. (The only reference is SF-scissor setting in brw_sf_state.c:upload_sf_vp() which is for gen 6). I guess I can say I'm confused. if (render_to_fbo) { - vp[12] = ctx-DrawBuffer-_Xmin; - vp[13] = ctx-DrawBuffer-_Xmax - 1; - vp[14] = ctx-DrawBuffer-_Ymin; - vp[15] = ctx-DrawBuffer-_Ymax - 1; + vp[12] = ctx-ViewportArray[i].X; + vp[13] = viewport_Xmax - 1; + vp[14] = ctx-ViewportArray[i].Y; + vp[15] = viewport_Ymax - 1; } else { - vp[12] = ctx-DrawBuffer-_Xmin; - vp[13] = ctx-DrawBuffer-_Xmax - 1; - vp[14] = ctx-DrawBuffer-Height - ctx-DrawBuffer-_Ymax; - vp[15] = ctx-DrawBuffer-Height - ctx-DrawBuffer-_Ymin - 1; + vp[12] = ctx-ViewportArray[i].X; + vp[13] = viewport_Xmax - 1; + vp[14] = ctx-DrawBuffer-Height - viewport_Ymax; + vp[15] = ctx-DrawBuffer-Height - ctx-ViewportArray[i].Y - 1; } vp += 16; I'll take a look, but Ken may have a more immediate answer. I was distracted by other things for the 3 weeks, but I am back looking at this now (and the GB clipping as well). My quick answer is that hardware will just do the right thing (I only looked at GEN8), but I need to read the docs further. I've yet to find a perf win, though some benchmarks do hit this path. I'll do some more piglit testing as well, and update with that info. -- Ben Widawsky, Intel Open Source Technology Center ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev