Re: [pygame] OBJ loader using VBOs
On Sat, Dec 4, 2010 at 5:49 PM, Ian Mallett wrote:

> Buffer binding is one of the slowest GL calls you can do, short of transferring huge chunks of data around (glTexImage2D, glReadPixels, etc.). State changing is one of the worst things you can do for efficiency, especially on top of a scripting language where the overhead is much higher. You have 40 sprites, each with 3 VBO bindings, and 5 texture bindings. If I'm understanding right, that's 120 VBO bindings and 200 texture bindings! Transferring this data across the bus once (when you put it in a display list) will speed things up greatly, but you'll notice that the framerate in 6 is still much lower than in 2 or 4.
>
> If you're just drawing sprites, chances are you don't need very many textures. At the very least, you can use a texture atlas, or batch calls by the texture required.
>
> I don't know exactly the situation you're in here, but unless the 40 sprites all have different geometry, you need only bind the data once, and then call glDrawArrays 40 times.
>
> In general, try to minimize binding calls, such as glUseProgram, glBindFramebuffer, glBindTexture, and glBindBuffer.

Yeah, that makes a lot of sense. I'm not sure how I could implement this advice in the standalone OBJ loader without imposing a lot of restrictions on how it could be used. For instance, I can see how you might make it more efficient when the sprites are sorted by the texture required. But I can't see how to do that and also make it so it doesn't break horribly if the sprites are unsorted.

I'll think about it, but for now I might stick with the display lists. My test is not precise enough to worry about the difference between 113fps and 121fps. I think based on the test that method #6 is effectively as fast as #4.

I may try to squeeze a few more frames per second out, in which case I'll make a more controlled test, but mostly I was interested in the loading speedup that you see between #2 and #4/6 (950 ms down to 30 ms or 12 ms, roughly 30x to 80x).

-Christopher
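One way the "sorted by texture, but still safe when unsorted" idea could look is sketched below. It groups sprites by texture before drawing, so the caller never has to pre-sort. Everything here is hypothetical and not taken from the loader: the sprite dictionaries, `batch_by_texture`, `draw_all`, and the `bind_texture` and `draw` callbacks are stand-ins that only record what would be called.

```python
from collections import defaultdict


def batch_by_texture(sprites):
    """Group sprites by texture key, keeping first-seen order of the keys."""
    batches = defaultdict(list)  # dicts keep insertion order on Python 3.7+
    for sprite in sprites:
        batches[sprite["texture"]].append(sprite)
    return batches


def draw_all(sprites, bind_texture, draw):
    """Bind each texture once, then draw every sprite that uses it."""
    for texture, group in batch_by_texture(sprites).items():
        bind_texture(texture)
        for sprite in group:
            draw(sprite)


# Stand-in callbacks: they record calls instead of talking to OpenGL.
binds, draws = [], []
sprites = [
    {"name": "a", "texture": "wood"},
    {"name": "b", "texture": "metal"},
    {"name": "c", "texture": "wood"},
    {"name": "d", "texture": "metal"},
]
draw_all(sprites, binds.append, lambda s: draws.append(s["name"]))
print(binds)  # two binds for four unsorted sprites
print(draws)
```

Grouping changes the draw order, so this only fits when the sprites do not depend on being drawn in a particular sequence (for example back-to-front blending).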
Re: [pygame] OBJ loader using VBOs
On Sat, Dec 4, 2010 at 3:37 PM, Christopher Night wrote:

> For this model, glEnableClientState gets called once for vertices, normals, and texcoords. There are 5 materials, each with one glBindTexture and two glDrawArrays (one for triangles, one for quads). So the total calls per render is:
>
>     2 x glEnable/glDisable
>     1 x glFrontFace
>     3 x vbo.bind
>     3 x glEnableClientState
>     10 x glDrawArrays
>     5 x glBindTexture
>     4 x glColor
>
> And I'm rendering 40 sprites, so I'm doing this 40 times per frame. I'm assuming that in a real application, each model would have its own separate VBOs. Is that what I'm doing wrong? Or is there something else?

Buffer binding is one of the slowest GL calls you can do, short of transferring huge chunks of data around (glTexImage2D, glReadPixels, etc.). State changing is one of the worst things you can do for efficiency, especially on top of a scripting language where the overhead is much higher. You have 40 sprites, each with 3 VBO bindings, and 5 texture bindings. If I'm understanding right, that's 120 VBO bindings and 200 texture bindings! Transferring this data across the bus once (when you put it in a display list) will speed things up greatly, but you'll notice that the framerate in 6 is still much lower than in 2 or 4.

If you're just drawing sprites, chances are you don't need very many textures. At the very least, you can use a texture atlas, or batch calls by the texture required.

I don't know exactly the situation you're in here, but unless the 40 sprites all have different geometry, you need only bind the data once, and then call glDrawArrays 40 times.

In general, try to minimize binding calls, such as glUseProgram, glBindFramebuffer, glBindTexture, and glBindBuffer.

> The reason it takes so long to load on 2 is generating the display list. This method was taken from the objloader on the wiki, and it involves 1646 glVertex3f calls, one for each vertex in the model, and similarly with glNormal and glTexCoords.
>
> Thanks again!
>
> -Christopher

Ian
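The "bind the data once, then call glDrawArrays 40 times" suggestion can be illustrated without a GL context. The sketch below uses a hypothetical `CallCounter` class in place of real OpenGL bindings. It only counts calls, and it simplifies each sprite to a single draw call over the 1646 vertices mentioned in the thread.

```python
from collections import Counter


class CallCounter:
    """Hypothetical stand-in for a GL layer; it only counts calls by name."""

    def __init__(self):
        self.calls = Counter()

    def bind_buffer(self, name):
        self.calls["bind_buffer"] += 1

    def draw_arrays(self, first, count):
        self.calls["draw_arrays"] += 1


def draw_rebinding(gl, n_sprites, buffers):
    """Rebind every buffer for every sprite (the pattern being criticized)."""
    for _ in range(n_sprites):
        for name in buffers:
            gl.bind_buffer(name)
        gl.draw_arrays(0, 1646)


def draw_shared(gl, n_sprites, buffers):
    """Bind shared geometry once, then issue one draw per sprite."""
    for name in buffers:
        gl.bind_buffer(name)
    for _ in range(n_sprites):
        gl.draw_arrays(0, 1646)


buffers = ("vertex", "normal", "texcoord")
naive, shared = CallCounter(), CallCounter()
draw_rebinding(naive, 40, buffers)
draw_shared(shared, 40, buffers)
print(naive.calls["bind_buffer"], shared.calls["bind_buffer"])
```

In a real renderer each sprite would still need its own transform set between draw calls. The sketch leaves that out because it only concerns the binding count.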
Re: [pygame] OBJ loader using VBOs
On Sat, Dec 4, 2010 at 4:05 PM, Ian Mallett wrote:

> On Sat, Dec 4, 2010 at 1:21 PM, Christopher Night wrote:
>
>> Hi, I'm working on a standalone OBJ loader based on the well-known one on the pygame wiki:
>> http://www.pygame.org/wiki/OBJFileLoader
>>
>> My goal is to speed up load times by making the model objects picklable, so the OBJ file doesn't have to be read every time you start up. Here's my current version:
>> http://christophernight.net/stuff/fasterobj-0.tgz
>>
>> It still needs some cleaning up, but it's got almost all the functionality I wanted. In addition to making things picklable, it has a small optimization by combining triangles and quads when possible to reduce the number of GL calls.
>>
>> There are three classes: OBJ (using fixed function), OBJ_array (using vertex arrays), and OBJ_vbo (using vertex buffer objects). Additionally, any of these can be used with or without a display list. Here's the results of my test on some model I had lying around:
>>
>>     type   list?  parse save load render
>>  1. fixed  False  146 13 140.03fps
>>  2. fixed  True   124 10 950 117.80fps
>>  3. array  False  179891.26fps
>>  4. array  True   1747 30 121.08fps
>>  5. vbo    False  14378 16.06fps
>>  6. vbo    True   1428 12 112.98fps
>>
>> #2 is the method in the original OBJ loader. The times listed under parse, save, and load are times in milliseconds to read from the OBJ file and do some preprocessing, pickle to a file, and unpickle from a file. The load step also includes generating the display list, if necessary.
>
> I completely would have expected the results in 1-4.
>
> However, I'm quite surprised at the vbo method 5. It should run in speed between 2 and 4. I also would have expected 4 and 6 to be much closer.
>
> How many VBOs are you using? If you switch buffer bindings a lot for each draw (like your object has 10 different parts, each with a vertex, normal, and texcoord VBO) then you *might* get results like that . . .

Awesome, thanks so much for taking a look! I'm using 3 VBOs, one each for vertex, normal, and texcoord. This is the entire rendering code for OBJ_vbo:

    glEnable(GL_TEXTURE_2D)
    glFrontFace(GL_CCW)
    self.vbo_v.bind()
    glVertexPointerf(self.vbo_v)
    self.vbo_n.bind()
    glNormalPointerf(self.vbo_n)
    self.vbo_t.bind()
    glTexCoordPointerf(self.vbo_t)
    glEnableClientState(GL_VERTEX_ARRAY)
    texon, normon = None, None
    for material, mindices in self.indices:
        self.mtl.bind(material)
        for nvs, dotex, donorm, ioffset, isize in mindices:
            if donorm != normon:
                normon = donorm
                (glEnableClientState if donorm else glDisableClientState)(GL_NORMAL_ARRAY)
            if dotex != texon:
                texon = dotex
                (glEnableClientState if dotex else glDisableClientState)(GL_TEXTURE_COORD_ARRAY)
            shape = [GL_TRIANGLES, GL_QUADS, GL_POLYGON][nvs-3]
            glDrawArrays(shape, ioffset, isize)
    glDisable(GL_TEXTURE_2D)

For this model, glEnableClientState gets called once for vertices, normals, and texcoords. There are 5 materials, each with one glBindTexture and two glDrawArrays (one for triangles, one for quads). So the total calls per render is:

    2 x glEnable/glDisable
    1 x glFrontFace
    3 x vbo.bind
    3 x glEnableClientState
    10 x glDrawArrays
    5 x glBindTexture
    4 x glColor

And I'm rendering 40 sprites, so I'm doing this 40 times per frame. I'm assuming that in a real application, each model would have its own separate VBOs. Is that what I'm doing wrong? Or is there something else?

The reason it takes so long to load on 2 is generating the display list. This method was taken from the objloader on the wiki, and it involves 1646 glVertex3f calls, one for each vertex in the model, and similarly with glNormal and glTexCoords.

Thanks again!

-Christopher
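As a quick check on the arithmetic, the per-render call counts listed in this message can be scaled to a full frame of 40 sprites. The dictionary below copies the counts from the message. The names `calls_per_render`, `calls_per_frame`, and `total_per_frame` are illustrative only.

```python
# Per-render GL call counts, as listed in the message above.
calls_per_render = {
    "glEnable/glDisable": 2,
    "glFrontFace": 1,
    "vbo.bind": 3,
    "glEnableClientState": 3,
    "glDrawArrays": 10,
    "glBindTexture": 5,
    "glColor": 4,
}

SPRITES = 40  # sprites rendered per frame in the test

# Every sprite repeats the full sequence, so each count scales linearly.
calls_per_frame = {name: n * SPRITES for name, n in calls_per_render.items()}
total_per_frame = sum(calls_per_frame.values())

for name, n in calls_per_frame.items():
    print(f"{n:5d} x {name}")
print(f"{total_per_frame:5d} calls per frame in total")
```

Under these counts the buffer and texture binds alone account for 320 calls per frame.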
Re: [pygame] OBJ loader using VBOs
Hi,

On Sat, Dec 4, 2010 at 1:21 PM, Christopher Night wrote:

> Hi, I'm working on a standalone OBJ loader based on the well-known one on the pygame wiki:
> http://www.pygame.org/wiki/OBJFileLoader
>
> My goal is to speed up load times by making the model objects picklable, so the OBJ file doesn't have to be read every time you start up. Here's my current version:
> http://christophernight.net/stuff/fasterobj-0.tgz
>
> It still needs some cleaning up, but it's got almost all the functionality I wanted. In addition to making things picklable, it has a small optimization by combining triangles and quads when possible to reduce the number of GL calls.
>
> There are three classes: OBJ (using fixed function), OBJ_array (using vertex arrays), and OBJ_vbo (using vertex buffer objects). Additionally, any of these can be used with or without a display list. Here's the results of my test on some model I had lying around:
>
>     type   list?  parse save load render
>  1. fixed  False  146 13 140.03fps
>  2. fixed  True   124 10 950 117.80fps
>  3. array  False  179891.26fps
>  4. array  True   1747 30 121.08fps
>  5. vbo    False  14378 16.06fps
>  6. vbo    True   1428 12 112.98fps
>
> #2 is the method in the original OBJ loader. The times listed under parse, save, and load are times in milliseconds to read from the OBJ file and do some preprocessing, pickle to a file, and unpickle from a file. The load step also includes generating the display list, if necessary. Obviously methods #1 and #3 render far too slow; they're just there for comparison.
>
> So anyway, it looks pretty good. I think that #4 or #6 would do fine for my purposes. However, I know that people don't like to put vertex arrays and VBOs inside display lists, so I want to know if there's some problem with this method. I understand that putting a VBO in a display list defeats the whole purpose of having a VBO, since you can't update it, but I imagine you're probably not going to be doing that with OBJ models anyway. Also, when I asked about this a few months ago, someone said that method #5 should outperform methods #1-4, and that doesn't seem to be the case. So I might be misusing the VBOs.

I completely would have expected the results in 1-4.

However, I'm quite surprised at the vbo method 5. It should run in speed between 2 and 4. I also would have expected 4 and 6 to be much closer.

How many VBOs are you using? If you switch buffer bindings a lot for each draw (like your object has 10 different parts, each with a vertex, normal, and texcoord VBO) then you *might* get results like that . . .

> Any other comments welcome too! If you have any OBJ files you want me to test, just let me know.

This is great, actually. I imagine pickling could make things much faster. Wonder why it took longer to load on 2?

> -Christopher

Ian
[pygame] OBJ loader using VBOs
Hi, I'm working on a standalone OBJ loader based on the well-known one on the pygame wiki:
http://www.pygame.org/wiki/OBJFileLoader

My goal is to speed up load times by making the model objects picklable, so the OBJ file doesn't have to be read every time you start up. Here's my current version:
http://christophernight.net/stuff/fasterobj-0.tgz

It still needs some cleaning up, but it's got almost all the functionality I wanted. In addition to making things picklable, it has a small optimization by combining triangles and quads when possible to reduce the number of GL calls.

There are three classes: OBJ (using fixed function), OBJ_array (using vertex arrays), and OBJ_vbo (using vertex buffer objects). Additionally, any of these can be used with or without a display list. Here's the results of my test on some model I had lying around:

    type   list?  parse save load render
 1. fixed  False  146 13 140.03fps
 2. fixed  True   124 10 950 117.80fps
 3. array  False  179891.26fps
 4. array  True   1747 30 121.08fps
 5. vbo    False  14378 16.06fps
 6. vbo    True   1428 12 112.98fps

#2 is the method in the original OBJ loader. The times listed under parse, save, and load are times in milliseconds to read from the OBJ file and do some preprocessing, pickle to a file, and unpickle from a file. The load step also includes generating the display list, if necessary. Obviously methods #1 and #3 render far too slow; they're just there for comparison.

So anyway, it looks pretty good. I think that #4 or #6 would do fine for my purposes. However, I know that people don't like to put vertex arrays and VBOs inside display lists, so I want to know if there's some problem with this method. I understand that putting a VBO in a display list defeats the whole purpose of having a VBO, since you can't update it, but I imagine you're probably not going to be doing that with OBJ models anyway.

Also, when I asked about this a few months ago, someone said that method #5 should outperform methods #1-4, and that doesn't seem to be the case. So I might be misusing the VBOs.

Any other comments welcome too! If you have any OBJ files you want me to test, just let me know.

-Christopher
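For readers who have not seen the pickling approach, here is a minimal sketch of the general idea, not the code in fasterobj. `parse_obj` is a deliberately tiny, hypothetical stand-in parser that handles only `v` and `f` records with positive indices. `load_cached` and the file names are likewise made up for illustration.

```python
import os
import pickle
import tempfile


def parse_obj(text):
    """Tiny stand-in parser: collects 'v' and 'f' records only."""
    model = {"vertices": [], "faces": []}
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue
        if parts[0] == "v":
            model["vertices"].append(tuple(float(x) for x in parts[1:4]))
        elif parts[0] == "f":
            # Keep only the vertex index of each "v/vt/vn" group, 0-based.
            model["faces"].append(
                tuple(int(p.split("/")[0]) - 1 for p in parts[1:])
            )
    return model


def load_cached(obj_path, cache_path):
    """Unpickle the cache if it is at least as new as the OBJ file;
    otherwise parse the OBJ file and write a fresh cache."""
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(obj_path)):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    with open(obj_path) as f:
        model = parse_obj(f.read())
    with open(cache_path, "wb") as f:
        pickle.dump(model, f, pickle.HIGHEST_PROTOCOL)
    return model


# Usage: the first call parses and writes the cache, the second unpickles it.
with tempfile.TemporaryDirectory() as d:
    obj_path = os.path.join(d, "tri.obj")
    cache_path = os.path.join(d, "tri.pickle")
    with open(obj_path, "w") as f:
        f.write("v 0 0 0\nv 1 0 0\nv 0 1 0\nf 1 2 3\n")
    first = load_cached(obj_path, cache_path)
    cache_written = os.path.exists(cache_path)
    second = load_cached(obj_path, cache_path)
```

Only plain Python data is pickled here. GL handles such as display-list or buffer IDs belong to a live context and would have to be recreated after loading, which matches the note above that the load step includes generating the display list. Unpickling should also be limited to cache files the program wrote itself, since pickle can execute arbitrary code on load.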