On Tue, Jul 07, 2015 at 10:31:07AM -0700, Kenneth Graunke wrote:
> On Tuesday, July 07, 2015 04:46:22 PM Chris Wilson wrote:
> > On Tue, Jul 07, 2015 at 10:12:20AM +0100, Chris Wilson wrote:
> > > On Mon, Jul 06, 2015 at 09:05:18PM -0700, Kristian Høgsberg wrote:
> > > > On Mon, Jul 6, 2015 at 12:36 PM, Kenneth Graunke
> > > > <kenn...@whitecape.org> wrote:
> > > > > On Monday, July 06, 2015 11:33:15 AM Chris Wilson wrote:
> > > > >> Since the purpose of transform feedback tends to be for the client to
> > > > >> act upon the results to change the geometry in the scene, it is likely
> > > > >> that the client will soon be waiting upon the results. Flush the batch
> > > > >> early so that we don't build up a long queue of commands afterwards that
> > > > >> could delay the readback.
> > > > >> ---
> > > > >>  src/mesa/drivers/dri/i965/gen7_sol_state.c | 6 ++++++
> > > > >>  1 file changed, 6 insertions(+)
> > > > >>
> > > > >> diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > > > >> index 857ebe5..13dbe5b 100644
> > > > >> --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > > > >> +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > > > >> @@ -494,6 +494,12 @@ gen7_end_transform_feedback(struct gl_context *ctx,
> > > > >>
> > > > >>     brw_batch_end(&brw->batch);
> > > > >>
> > > > >> +   /* We will likely want to read the results in the very near future, so
> > > > >> +    * push this primitive to hardware if it is currently idle.
> > > > >> +    */
> > > > >> +   if (!brw_batch_busy(&brw->batch))
> > > > >> +      brw_batch_flush(&brw->batch);
> > > > >> +
> > > > >>     /* EndTransformFeedback() means that we need to update the number of
> > > > >>      * vertices written. Since it's only necessary if DrawTransformFeedback()
> > > > >>      * is called and it means mapping a buffer object, we delay computing it
> > > > >>
> > > > >
> > > > > We need some data to justify this change.
> > > >
> > > > I think even the theory is not correct - transform feedback is
> > > > typically fed back into the GPU (as new geometry, eg) rather than
> > > > consumed by the CPU, and in that case the flush is not helpful. But at
> > > > the end of the day, data will tell.
> > >
> > > How are they fed back? Can the xfb buffer be bound to the vertex buffer?
> > > (Genuine question! The only examples I've seen were for testing by the
> > > CPU.)
>
> Yes, it can. Just glBindBuffer() some buffers around. Or, I suspect
> one could bind it as a texture buffer object or SSBO and then use a
> compute shader on the results.
>
> With GL 4.x, the "avoid synchronizing with the CPU" mentality is a lot
> more prevalent, due to the advent of compute shaders.
>
> >
> > I've reviewed the code again, and gen7_end_transform_feedback() is always
> > followed by brw_compute_xfb_vertices_written (and a read of the sol
> > buffer) afaict, maybe not immediately but always before the next
> > transform feedback.
>
> Sadly, yes. We have a primitive count and we need a vertex count - so,
> a tiny bit of math. Ideally, we would use the Gen7.5 MI_MATH+ feature
> to do this, eliminating the CPU-GPU synchronization point.
>
> > Also afaict it is not possible to map the sol buffer directly into the
> > application.
> > -Chris
>
> It definitely is - the application creates GL buffer objects and binds
> them for use with transform feedback. They can certainly
> glMapBufferRange() those buffers.

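For reference, the "tiny bit of math" mentioned in the quoted text above (turning a primitive count into a vertex count) can be sketched roughly as follows. This is only an illustration, not the code in brw_compute_xfb_vertices_written(): the enum and the function name vertices_from_primitives() are hypothetical. It assumes only that transform feedback records independent points, lines or triangles, so each primitive contributes 1, 2 or 3 vertices.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the i965 driver. */
enum prim_mode {
   PRIM_POINTS,
   PRIM_LINES,
   PRIM_TRIANGLES,
};

/* Convert a count of primitives written by transform feedback into a
 * count of vertices. Transform feedback stores independent primitives,
 * so the factor is a constant per mode.
 */
static uint64_t
vertices_from_primitives(enum prim_mode mode, uint64_t prims_written)
{
   switch (mode) {
   case PRIM_POINTS:
      return prims_written;
   case PRIM_LINES:
      return prims_written * 2;
   case PRIM_TRIANGLES:
      return prims_written * 3;
   }

   /* Not reached for the three modes above. */
   assert(!"unknown primitive mode");
   return 0;
}
```

The multiply itself is trivial; the cost discussed in this thread is that the primitive count has to be read back from the GPU before the CPU can do it.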
The trouble I see is that the values stored currently are implementation
dependent and often reset. How is the application meant to use them
directly?

(Just trying to understand a bit better. If it is that the current
implementation is stalling when not required, then trying to speed those
stalls up really is just lipstick on a pig and irrelevant. The patch was
just trying to make a suggestion that feeding the GPU around expected
stall points works best with the current batch-level granularity of our
fences. Using intrabatch semaphores for the query objects seems a more
promising avenue than doing batch flushes anyway.)
-Chris

--
Chris Wilson, Intel Open Source Technology Centre