On Sat, Mar 13, 2010 at 1:30 PM, Luca Barbieri <luca.barbi...@gmail.com> wrote:
>> Having user-buffers with undefined size establishes a connection
>> inside the driver between two things which could previously be fully
>> understood separately - the vertex buffer is now no longer fully
>> defined without reference to an index buffer. Effectively the user
>> buffers become just unqualified pointers and we are back to the GL
>> world pre-VBOs.
>
> Yes, indeed.
>
>> In your examples, scanning and uploading (or transforming) per-index
>> is only something which is sensible in special cases - eg where there
>> are very few indices or sparse access to a large vertex buffer that
>> hasn't already been uploaded/transformed. But you can't even know if
>> the vertex buffer is sparse until you know how big it is, ie. what
>> max_index is...
>
> Reconsidering it, it does indeed seem to be sensible to always scan
> the index buffer for min/max if any elements are in user buffers,
> since if we discover it is dense, we should upload and otherwise do
> the index lookup in software (or perhaps remap the indices, but this
> isn't necessarily a good idea).
>
> And if we are uploading buffers, we must always do the scan to avoid
> generating segfaults due to out-of-bound reads.
> Hardware without draw_elements support could do without the scan, but
> I think all Gallium-capable hardware supports it.
>
> So it seems the only case where we don't necessarily need it is for
> swtnl, and here the performance loss due to scanning is probably
> insignificant compared to the cost of actually transforming vertices.
>
> So, yes, I think you are right and the current solution is the best.
>
> However, I still have doubts about the semantics of max_index in
> pipe_vertex_buffer.
> Isn't it better to _always_ set it to a valid value, even if it is
> just (vb->buffer->size - vb->buffer_offset) / vb->stride ?

Yes, indeed. Sorry, I must have missed that point in the earlier
emails. I would have thought that was what it was *always* set to
(and thus perhaps redundant).

> It seems this would solve Corbin's problem and make a better
> interface, following your principle of having well defined vertex
> buffers.
> The only cost is doing up to 8/16 divisions, but the driver may need
> to do them anyway.
>
> Perhaps we could amortize this by only creating/setting the
> pipe_vertex_buffers/elements on VBO/array change instead of on every
> draw call like we seem to be doing now, in cases where it is possible.

Indeed that would be an improvement on the current situation.

> An alternative option could be to remove it, and have the driver do
> the computation itself instead (and use draw_range_elements to pass a
> user-specified or scanned max value).

I think I prefer the first approach, to try and reduce rework everywhere.

Keith

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
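
For reference, the two computations discussed in this thread (the default max_index bound and the min/max scan of the index buffer) could be sketched roughly as follows. This is only an illustration, not Mesa code: the sketch_* struct and functions are hypothetical stand-ins rather than the real pipe_vertex_buffer, the handling of a zero stride is an assumption, and the "dense" threshold is an arbitrary choice made for the example. The bound is the formula from the thread as written, so it is really a count of whole strides that fit in the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a vertex buffer binding (hypothetical,
 * NOT the real Gallium struct). */
struct sketch_vertex_buffer {
    size_t buffer_size;   /* total size of the backing buffer, in bytes */
    size_t buffer_offset; /* byte offset of the first vertex */
    size_t stride;        /* bytes between consecutive vertices */
};

/* Bound from the thread: (size - buffer_offset) / stride. */
static uint32_t
sketch_max_index_bound(const struct sketch_vertex_buffer *vb)
{
    if (vb->buffer_offset >= vb->buffer_size)
        return 0;               /* nothing addressable */
    if (vb->stride == 0)
        return UINT32_MAX;      /* assumption: all indices read one vertex */

    size_t n = (vb->buffer_size - vb->buffer_offset) / vb->stride;
    return n > UINT32_MAX ? UINT32_MAX : (uint32_t)n;
}

struct sketch_index_range {
    uint32_t min_index;
    uint32_t max_index;
};

/* Linear scan of 16-bit indices for the smallest and largest value.
 * For count == 0 the result is the empty range {UINT32_MAX, 0}. */
static struct sketch_index_range
sketch_scan_indices(const uint16_t *indices, size_t count)
{
    struct sketch_index_range r = { UINT32_MAX, 0 };

    assert(indices != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (indices[i] < r.min_index)
            r.min_index = indices[i];
        if (indices[i] > r.max_index)
            r.max_index = indices[i];
    }
    return r;
}

/* Arbitrary example heuristic: treat the access as dense when the
 * referenced vertex span is at most twice the number of indices. */
static int
sketch_is_dense(struct sketch_index_range r, size_t count)
{
    if (count == 0)
        return 0;
    size_t span = (size_t)r.max_index - r.min_index + 1;
    return span <= 2 * count;
}
```

A driver following the reasoning above would upload the [min_index, max_index] range when the access looks dense, and fall back to a per-index lookup otherwise. Clamping the scanned max_index against the bound is what guards against out-of-bounds reads during the upload.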