> Having user-buffers with undefined size establishes a connection
> inside the driver between two things which could previously be fully
> understood separately - the vertex buffer is now no longer fully
> defined without reference to an index buffer. Effectively the user
> buffer become just unqualified pointers and we are back to the GL
> world pre-VBOs.

Yes, indeed.

> In your examples, scanning and uploading (or transforming) per-index
> is only something which is sensible in special cases - eg where there
> are very few indices or sparse access to a large vertex buffer that
> hasn't already been uploaded/transformed. But you can't even know if
> the vertex buffer is sparse until you know how big it is, ie. what
> max_index is...

On reconsideration, it does seem sensible to always scan the index
buffer for min/max if any elements are in user buffers: if we discover
the access is dense, we should upload, and otherwise do the index
lookup in software (or perhaps remap the indices, but this isn't
necessarily a good idea).

And if we are uploading buffers, we must always do the scan to avoid
segfaults caused by out-of-bounds reads.

Hardware without draw_elements support could do without the scan, but
I think all Gallium-capable hardware supports it.

So it seems the only case where we don't necessarily need the scan is
swtnl, and there the performance loss due to scanning is probably
insignificant compared to the cost of actually transforming vertices.

So, yes, I think you are right and the current solution is the best.

However, I still have doubts about the semantics of max_index in
pipe_vertex_buffer.
Isn't it better to _always_ set it to a valid value, even if it is
just (vb->buffer->size - vb->buffer_offset) / vb->stride ?

It seems this would solve Corbin's problem and make a better interface,
following your principle of having well-defined vertex buffers.
The only cost is doing up to 8/16 divisions, but the driver may need
to do them anyway.

Perhaps we could amortize this, in cases where it is possible, by only
creating/setting the pipe_vertex_buffers/elements on VBO/array change
instead of on every draw call as we seem to be doing now.
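
To make the two ideas above a bit more concrete, here is a rough sketch
in C. It is not Gallium code: the struct and all sketch_* names are
hypothetical stand-ins, the indices are assumed to be 16-bit, and the
"dense" threshold (range at most twice the index count) is an arbitrary
assumption for illustration only. The bound function implements the
expression from the paragraph above as written; it counts whole strides
and does not account for the size of the last vertex element.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a vertex buffer binding.
 * This is NOT the real pipe_vertex_buffer layout. */
struct sketch_vertex_buffer {
    size_t size;          /* total size of the backing buffer, in bytes */
    size_t buffer_offset; /* byte offset of the first vertex */
    size_t stride;        /* bytes between consecutive vertices */
};

/* Fallback bound: (size - buffer_offset) / stride.
 * Returns 0 when stride is 0 or the offset lies outside the buffer,
 * so the division and subtraction are always safe. */
static size_t sketch_vertex_bound(const struct sketch_vertex_buffer *vb)
{
    if (vb->stride == 0 || vb->buffer_offset >= vb->size)
        return 0;
    return (vb->size - vb->buffer_offset) / vb->stride;
}

/* Scan 16-bit indices for the smallest and largest value.
 * Returns 1 and fills the outputs on success, 0 if there are no indices. */
static int sketch_scan_indices(const uint16_t *indices, size_t count,
                               uint32_t *min_out, uint32_t *max_out)
{
    uint32_t lo, hi;
    size_t i;

    if (count == 0)
        return 0;

    lo = hi = indices[0];
    for (i = 1; i < count; i++) {
        if (indices[i] < lo)
            lo = indices[i];
        if (indices[i] > hi)
            hi = indices[i];
    }
    *min_out = lo;
    *max_out = hi;
    return 1;
}

/* Assumed heuristic: treat the access as dense (worth uploading the
 * whole [min, max] range) when the range covers at most twice as many
 * vertices as there are indices. The factor 2 is made up. */
static int sketch_range_is_dense(uint32_t min_index, uint32_t max_index,
                                 size_t count)
{
    size_t range = (size_t)max_index - (size_t)min_index + 1;
    return range <= 2 * count;
}
```

A caller would scan first, then either upload the [min, max] range when
the heuristic says dense or fall back to a per-index lookup, and could
compare the scanned max against the size-derived bound before reading.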

An alternative option could be to remove it, and have the driver do the
computation itself instead (and use draw_range_elements to pass a
user-specified or scanned max value).

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev