Re: [pygame] Faster OBJ loader
P.S. If the OBJ file has multiple objects using the same textures/materials, then loading each material only once and sharing it can give big speedups. This is not generally the case for simple single models, but it matters more for a really big scene.
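A minimal sketch of that sharing idea, assuming a hypothetical `load_material` callable that does the expensive work (parsing the .mtl entry, uploading the texture). Neither `load_material` nor `MaterialCache` comes from the wiki loader; they are only illustrative names.

```python
class MaterialCache:
    """Load each material at most once and hand out the shared instance."""

    def __init__(self, load_material):
        # load_material is a hypothetical callable: material name -> material object
        self._load = load_material
        self._materials = {}

    def get(self, name):
        # Only the first request for a given name pays the loading cost.
        if name not in self._materials:
            self._materials[name] = self._load(name)
        return self._materials[name]
```

Every object in the scene would then ask the cache for its material by name instead of loading it itself.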
Re: [pygame] Faster OBJ loader
On Mon, Sep 27, 2010 at 8:05 PM, Christopher Night cosmologi...@gmail.com wrote:

> Thanks for the response! Assuming that rendering means what I think it means (actually drawing the thing on the screen, right?),

Absolutely.

> I was able to improve the rendering performance significantly using vertex arrays in the test I did a few months ago. I was still using a display list as well, but I greatly reduced the number of GL commands within the display list. The trick was to triangulate all the faces, and render all the faces for a given material using a single call to glDrawArrays(GL_TRIANGLES...). I realize this is hardware-dependent, but the speedup was dramatic on 2 out of 2 systems that I've tried. Maybe it's not the vertex arrays that matter, and triangulating all the faces and using a single call to glBegin(GL_TRIANGLES) would yield the same speedup. Either way, it's worth looking into, I think.

You must not be using the calls as you think you are, then.

Graphics cards have a graphics bus that handles data transfer to and from the card. Unfortunately, this graphics bus is slower than either the CPU or the GPU themselves.

Fixed function (glVertex3f(...), glNormal3f(...), etc.) sends the data to the card on each call. So, if you have 900 such calls, you send 900 state changes to OpenGL across the graphics bus each time the data is drawn. This can get slow.

Vertex arrays work similarly, except the data is stored as an array, and the equivalent of the 900 fixed function calls is sent across the graphics bus each frame. Although this batched approach is faster than fixed function, all the data is still transferred to the card each time the data is drawn.

Display lists work by caching operations on the graphics card. You can specify nearly anything inside a display list, including fixed function and (I think) vertex arrays. To use display lists, you reserve a list with glGenLists() and wrap the drawing code in glNewList() and glEndList() calls. The code inside, after being transferred to the GPU, is stored for later use. Later, you can call glCallList() with the appropriate list argument. The list's NUMBER is transferred to the GPU, and the relevant set of cached operations is executed.

The practical upshot of all this is that each time you draw the object, you pass a single number to the graphics card, and the appropriate cached operations are executed. This is the way the Wiki .obj loader works.

Vertex Buffer Objects are a more advanced topic, but they work by caching vertex arrays on the GPU.

> Improved loading performance comes from the fact that these arrays can be pickled, so you don't have to read them directly from the OBJ file after the first time. You only need to distribute the pickled OBJ/MTL files with your actual game. Psyco or C might be just as good, but this solution was pretty simple, and reduces the (subsequent) load times to almost nothing.

This is actually a fantastic idea; I love it!

> -Christopher

Ian
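The display-list workflow described in this message can be sketched roughly as follows. This is an illustration under assumptions, not the wiki loader's code: the `gl` parameter is expected to be a module-like object exposing the OpenGL 1.x calls (for example PyOpenGL's `OpenGL.GL`, with a GL context already current), and the `(normal, vertices)` triangle layout is made up for the example. The single-list call is `glCallList`.

```python
def compile_display_list(gl, triangles):
    """Record immediate-mode calls for `triangles` into a display list.

    gl        -- module-like object with the OpenGL 1.x API (assumption:
                 e.g. OpenGL.GL from PyOpenGL, with a current GL context)
    triangles -- iterable of (normal, (v0, v1, v2)) tuples (illustrative layout)
    """
    list_id = gl.glGenLists(1)            # reserve one list name
    gl.glNewList(list_id, gl.GL_COMPILE)  # start recording without executing
    gl.glBegin(gl.GL_TRIANGLES)
    for normal, vertices in triangles:
        gl.glNormal3f(*normal)
        for vertex in vertices:
            gl.glVertex3f(*vertex)
    gl.glEnd()
    gl.glEndList()                        # stop recording
    return list_id


def draw_display_list(gl, list_id):
    # Replays the recorded commands; only the list id is passed per draw.
    gl.glCallList(list_id)
```

The list would be compiled once at load time and `draw_display_list` called every frame.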
Re: [pygame] Faster OBJ loader
It should be noted that display lists are deprecated (but for all intents and purposes still there) in the anti-fixed-function OpenGL 3.1+ (and perhaps completely unavailable in OpenGL ES 2.0?). So learning VBOs is probably a good idea. Like display lists, they mean you aren't passing vertex arrays over the graphics bus on every draw call, because the vertex arrays are stored on the graphics card.

The advantage of VBOs over display lists is that you can store static or dynamic data. The disadvantage is that you can only store certain kinds of data and not operations, so on some cards display lists are still faster than VBOs.

Devon
Re: [pygame] Faster OBJ loader
On Tue, Sep 28, 2010 at 1:05 PM, Devon Scott-Tunkin devon.scotttun...@gmail.com wrote:

> It should be noted that display lists are deprecated (but for all intents and purposes still there) in the anti-fixed-function OpenGL 3.1+ (and perhaps completely unavailable in OpenGL ES 2.0?). So learning VBOs is probably a good idea, and like display lists it also means you aren't passing vertex arrays over the graphics bus on every draw call because the vertex arrays are stored on the graphics card. The advantage of VBOs over display lists is you can store static or dynamic data, the disadvantage is you can only store certain kinds of data and not operations, so on some cards display lists are still faster than VBOs.

Correct. Personally, I think display lists are so ubiquitous that they aren't going away for decades. I agree that VBOs are the future, though. They also have the advantage that the program field of vertex attributes can be dynamically altered. This is important for programmable shading, particularly when you want to use potentially more than one shader to draw a single object.
Re: [pygame] Faster OBJ loader
Ian Mallett wrote:

> Correct. Personally, I think display lists are so ubiquitous that they aren't going away for decades. I agree that VBOs are the future though.

Note that display lists and VBOs don't have to be mutually exclusive. A display list could contain instructions to draw from VBOs.

--
Greg
Re: [pygame] Faster OBJ loader
On Tue, Sep 28, 2010 at 2:05 PM, Ian Mallett geometr...@gmail.com wrote:

> The practical upshot of all this is that each time you draw the object, you pass a single number to the graphics card, and the appropriate cached operations are executed. This is the way the Wiki .obj loader works.

Excellent, thank you very much for the explanation. You're absolutely right that I'm likely not using the calls the way I think I am. :-) I understand now that no matter what's in the display list, only a single number is passed to the graphics card, so there can't be any optimization on the outside. However, wouldn't it be possible for some display lists to execute faster within the graphics card than others?

Attached below is a script that demonstrates what I'm talking about. It should render a torus repeatedly for 60 seconds, first without vertex arrays (the way the objloader on the wiki does it), and second with vertex arrays. For me the output is:

Without arrays: 58.0fps
With arrays: 140.0fps

It takes several minutes to run, because the torus has a huge number of faces it has to generate. I had to do that to get the framerate down. Anyway, this is the kind of test that suggests to me that vertex arrays might help. Do you see something wrong with it?

As for VBOs, I know I should learn them. If I can figure them out, and I can get the same performance from them, that would be preferable. However, that's just for the sake of using non-deprecated techniques: for an OBJ loader, there wouldn't seem to be much need for dynamic data.

-Christopher

import pygame
from pygame.locals import *
from math import sin, cos, pi
from OpenGL.GL import *
from OpenGL.GLU import *

tmax = 60.  # Time to run each test for

# Generate the torus faces
nx, ny = 600, 400
xcos = [cos(x * 2 * pi / nx) for x in range(nx+1)]
xsin = [sin(x * 2 * pi / nx) for x in range(nx+1)]
ycos = [cos(y * 2 * pi / ny) for y in range(ny+1)]
ysin = [sin(y * 2 * pi / ny) for y in range(ny+1)]

def coords(x, y):
    return xcos[x] * (2 + ycos[y]), ysin[y], xsin[x] * (2 + ycos[y])

def normals(x, y):
    return xcos[x] * ycos[y], ysin[y], xsin[x] * ycos[y]

faces, vlist, nlist = [], [], []
for x in range(nx):
    for y in range(ny):
        vs = (coords(x,y), coords(x+1,y), coords(x+1,y+1), coords(x,y+1))
        ns = (normals(x,y), normals(x+1,y), normals(x+1,y+1), normals(x,y+1))
        faces.append((vs, ns))
        for v in vs: vlist.extend(v)
        for n in ns: nlist.extend(n)

for usearray in (False, True):
    # Initialize pygame and OpenGL
    pygame.init()
    pygame.display.set_mode((640, 480), DOUBLEBUF | OPENGL)
    glLightfv(GL_LIGHT0, GL_POSITION, (10,10,10, 0.0))
    glLightfv(GL_LIGHT0, GL_DIFFUSE, (0.5, 0.5, 0.5, 1.0))
    glEnable(GL_LIGHT0)
    glEnable(GL_LIGHTING)
    glEnable(GL_COLOR_MATERIAL)
    glEnable(GL_DEPTH_TEST)
    # [the rest of the attached script is cut off in the archive]
Re: [pygame] Faster OBJ loader
Hi,

The display list caches all operations inside of it. You are correct that some display lists take longer to call than others, as the cached data may be different.

In your example, notice that you're bounding the draw code in the display list, so you're caching the fixed function or the vertex arrays. Vertex arrays are slightly faster than fixed function, even when they are cached, which is why the vertex arrays seem faster here: you're really drawing the vertex arrays as a display list! However, if you were drawing vertex arrays normally, versus fixed function in a display list, the fixed function would be faster.

In order of increasing speed:

1. fixed function
2. vertex arrays
3. display list of fixed function
4. display list of vertex arrays
5. VBO

My comment was that 2 is slower than 3. However, your test program is testing 3 against 4. Usually 4 is not done; it's generally 5 if you need the flexibility of vertex arrays. Generally, when saying vertex arrays, that means the program uses no display lists (i.e., 2 is meant instead of 4).

Incidentally, thank you so much for providing an example!

Ian
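For reference, option 4 in the list above (a display list wrapping a vertex-array draw) might look roughly like this. It is a sketch under assumptions: `gl` stands for a module-like object with the OpenGL 1.x API (for example PyOpenGL's `OpenGL.GL`, context current), and `vertex_data` is assumed to be a flat float array such as a numpy float32 array. Client-state calls like glVertexPointer are executed immediately rather than recorded, while glDrawArrays is recorded with the array contents read at compile time.

```python
def compile_array_display_list(gl, vertex_data, vertex_count):
    """Wrap one glDrawArrays call in a display list (option 4).

    gl           -- module-like object with the OpenGL 1.x API (assumption)
    vertex_data  -- flat x, y, z floats, e.g. a numpy float32 array (assumption)
    vertex_count -- number of vertices, a multiple of 3 for GL_TRIANGLES
    """
    # Client state is set up now; these calls are not stored in the list.
    gl.glEnableClientState(gl.GL_VERTEX_ARRAY)
    gl.glVertexPointer(3, gl.GL_FLOAT, 0, vertex_data)

    list_id = gl.glGenLists(1)
    gl.glNewList(list_id, gl.GL_COMPILE)
    # The array contents are read when the list is compiled.
    gl.glDrawArrays(gl.GL_TRIANGLES, 0, vertex_count)
    gl.glEndList()

    gl.glDisableClientState(gl.GL_VERTEX_ARRAY)
    return list_id
```

Drawing is then the same single `glCallList(list_id)` per frame as for a display list of fixed-function calls.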
Re: [pygame] Faster OBJ loader
On Tue, Sep 28, 2010 at 3:05 PM, Devon Scott-Tunkin devon.scotttun...@gmail.com wrote:

> It should be noted that display lists are deprecated (but for all intents and purposes still there) in the anti-fixed function opengl 3.1+ (and perhaps completely unavailable in open gl es 2.0?). So learning vbos is probably a good idea...

So, dumb question: are VBOs actually implemented in PyOpenGL? I've started looking into it, and I can't find any working examples of anyone using a VBO in Python. (I've found one that looks okay, but it must be for a previous version, because the function signatures don't even match up for me.) I've tried to translate examples from C with no luck. As far as I can tell, here's how you would set one up (vlist is a numpy array):

gl_buffer = glGenBuffers(1)
glBindBuffer(GL_ARRAY_BUFFER_ARB, gl_buffer)
glBufferData(GL_ARRAY_BUFFER_ARB, vlist, GL_STATIC_DRAW)
glVertexPointerd(0)

The last line gives me an error:

TypeError: ('cannot be converted to pointer', bound method PointerType.voidDataPointer of class 'OpenGL.arrays.arraydatatype.GLdoubleArray')

So... yeah. Any working examples?

Thanks,
Christopher
Re: [pygame] Faster OBJ loader
Hi,

Again, my glLib implements VBOs, using:

from OpenGL.arrays import vbo

If you need further examples, there are some examples in the PyOpenGL distribution.

Ian
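A rough sketch of how the `vbo` helper mentioned above is typically used. This is not glLib's code. It assumes `buffer` is an `OpenGL.arrays.vbo.VBO` built once from a numpy float32 array of x, y, z triples, and `gl` is the `OpenGL.GL` module with a context current. Both are passed in as parameters here so the function itself has no hard dependency.

```python
def draw_vbo(gl, buffer, vertex_count):
    """Draw `vertex_count` vertices as triangles from an already-built VBO.

    gl     -- module-like object with the OpenGL API (assumption: OpenGL.GL)
    buffer -- object with bind()/unbind(), e.g. OpenGL.arrays.vbo.VBO wrapping
              a float32 array of x, y, z triples (assumption)
    """
    buffer.bind()  # makes the buffer current; PyOpenGL uploads the data on first bind
    try:
        gl.glEnableClientState(gl.GL_VERTEX_ARRAY)
        # While a VBO is bound, the pointer argument refers to the buffer
        # itself rather than to client memory.
        gl.glVertexPointer(3, gl.GL_FLOAT, 0, buffer)
        gl.glDrawArrays(gl.GL_TRIANGLES, 0, vertex_count)
    finally:
        gl.glDisableClientState(gl.GL_VERTEX_ARRAY)
        buffer.unbind()
```

With PyOpenGL and numpy installed, the buffer would be created once at load time, for example `buffer = vbo.VBO(numpy.array(vlist, dtype='float32'))`, and `draw_vbo(OpenGL.GL, buffer, len(vlist) // 3)` called each frame. Note the float32/GL_FLOAT pairing, as opposed to the double-precision pointer call in the question.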
[pygame] Faster OBJ loader
Hi,

I'm looking into modifying the well-known objloader.py on the pygame wiki:

http://www.pygame.org/wiki/OBJFileLoader

I would modify it to use vertex arrays. I think this could improve efficiency of loading and rendering the models, based on some tests I did a few months ago on the pyweek message board:

http://www.pyweek.org/d/3066/

I wanted to ask if this work has already been done by anyone, or if there is a different existing OBJ loader that could be used as a starting point. I searched this mailing list, and it looks to me like this is the current best OBJ loader for pygame there is.

Thanks,
Christopher
Re: [pygame] Faster OBJ loader
Hi,

On Mon, Sep 27, 2010 at 4:49 PM, Christopher Night cosmologi...@gmail.com wrote:

> Hi, I'm looking into modifying the well-known objloader.py on the pygame wiki: http://www.pygame.org/wiki/OBJFileLoader
>
> I would modify it to use vertex arrays. I think this could improve efficiency of loading and rendering the models, based on some tests I did a few months ago on the pyweek message board:

Vertex arrays would only be marginally faster than fixed functionality for *rendering*. This version loads into display lists, which are about as fast as possible for that. You won't be able to get better rendering performance.

For faster *loading*, you can try Psyco, or just resort to C.

> I wanted to ask if this work has already been done by anyone, or if there is a different existing OBJ loader that could be used as a starting point. I searched this mailing list, and it looks to me like this is the current best OBJ loader for pygame there is.

Having searched around, I'm fairly sure that this is the simplest and best *standalone* .obj loader.

However, I started with this particular loader and extensively modified it and improved it to support vertex arrays, vertex buffer objects, display lists, and fixed functionality. It also has better support for .mtl files and handles file loading and texturing more elegantly. It also calculates the tangent vectors for use in normalmapping and related techniques.

It's presently integrated into glLib Reloaded <http://www.pygame.org/project-glLib+Reloaded-1326-.html>, my project, which I humbly present. The actual loader (glLib/glLibLoadOBJ.py) is heavily tied into the rest of the library, and as such I can't support using it in other ways, but you may find it useful.

> Thanks,
> Christopher

Ian
Re: [pygame] Faster OBJ loader
On Mon, Sep 27, 2010 at 7:52 PM, Ian Mallett geometr...@gmail.com wrote:

> On Mon, Sep 27, 2010 at 4:49 PM, Christopher Night cosmologi...@gmail.com wrote:
>
>> Hi, I'm looking into modifying the well-known objloader.py on the pygame wiki: http://www.pygame.org/wiki/OBJFileLoader
>>
>> I would modify it to use vertex arrays. I think this could improve efficiency of loading and rendering the models, based on some tests I did a few months ago on the pyweek message board:
>
> Vertex arrays would only be marginally faster than fixed functionality for *rendering*. This version loads into display lists, which are about as fast as possible for that. You won't be able to get better rendering performance.

Thanks for the response! Assuming that rendering means what I think it means (actually drawing the thing on the screen, right?), I was able to improve the rendering performance significantly using vertex arrays in the test I did a few months ago. I was still using a display list as well, but I greatly reduced the number of GL commands within the display list. The trick was to triangulate all the faces, and render all the faces for a given material using a single call to glDrawArrays(GL_TRIANGLES...). I realize this is hardware-dependent, but the speedup was dramatic on 2 out of 2 systems that I've tried. Maybe it's not the vertex arrays that matter, and triangulating all the faces and using a single call to glBegin(GL_TRIANGLES) would yield the same speedup. Either way, it's worth looking into, I think.

Improved loading performance comes from the fact that these arrays can be pickled, so you don't have to read them directly from the OBJ file after the first time. You only need to distribute the pickled OBJ/MTL files with your actual game. Psyco or C might be just as good, but this solution was pretty simple, and reduces the (subsequent) load times to almost nothing.

I'm very interested in any more feedback you have on this! I'm a real beginner when it comes to OpenGL stuff!

> I started with this particular loader and extensively modified it and improved it to support vertex arrays, vertex buffer objects, display lists, and fixed functionality. It also has better support for .mtl files and handles file loading and texturing more elegantly. It also calculates the tangent vectors for use in normalmapping and related techniques.
>
> It's presently integrated into glLib Reloaded <http://www.pygame.org/project-glLib+Reloaded-1326-.html>, my project, which I humbly present. The actual loader (glLib/glLibLoadOBJ.py) is heavily tied into the rest of the library, and as such I can't support using it in other ways, but you may find it useful.

Great, thank you so much! I'll definitely have a look!

-Christopher
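The two ideas in this message (triangulating every face into one flat array per material, then pickling the result so later runs skip the OBJ parsing) could be sketched as below. This is an illustration, not the code from the pyweek test. The function names, the `faces_by_material` layout and the cache path are all assumptions, and fan triangulation is only valid for convex faces.

```python
import os
import pickle


def fan_triangulate(face):
    """Split a convex polygon (sequence of vertex indices) into triangles."""
    return [(face[0], face[i], face[i + 1]) for i in range(1, len(face) - 1)]


def build_arrays(vertices, faces_by_material):
    """Return {material: flat [x, y, z, ...] list}, three vertices per triangle.

    Each flat list can be handed to one glDrawArrays(GL_TRIANGLES, ...) call.
    """
    arrays = {}
    for material, faces in faces_by_material.items():
        flat = arrays.setdefault(material, [])
        for face in faces:
            for triangle in fan_triangulate(face):
                for index in triangle:
                    flat.extend(vertices[index])
    return arrays


def load_arrays(cache_path, build):
    """Return the pickled arrays if the cache exists, else build and cache them."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    arrays = build()  # slow path: parse the OBJ/MTL files
    with open(cache_path, "wb") as f:
        pickle.dump(arrays, f, protocol=pickle.HIGHEST_PROTOCOL)
    return arrays
```

This sketch never invalidates the cache, so the pickle has to be deleted when the model changes. Pickle files should also only be loaded from a trusted source, since unpickling can execute arbitrary code.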