Re: [pygame] Faster OBJ loader

2010-09-29 Thread René Dudfield
P.S. If the OBJ has multiple objects using the same textures/materials, then
loading each material only once and sharing it can give big speedups.  That is
not generally the case for simple single models, but more so for a really big
scene.
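
A minimal sketch of that sharing, assuming a hypothetical load_material(name)
that does the expensive work (reading the image, uploading the texture).
Neither that name nor MaterialCache comes from the wiki loader:

```python
class MaterialCache:
    """Load each material once and hand the same object to every user."""

    def __init__(self, loader):
        self._loader = loader      # hypothetical expensive loader function
        self._materials = {}       # material name -> loaded material
        self.loads = 0             # how many real loads happened

    def get(self, name):
        if name not in self._materials:
            self._materials[name] = self._loader(name)
            self.loads += 1
        return self._materials[name]


def load_material(name):
    # Stand-in for real work such as reading an image and creating a texture.
    return {"name": name}


cache = MaterialCache(load_material)
# Three objects in the scene, two of which share the "brick" material.
scene = [cache.get("brick"), cache.get("wood"), cache.get("brick")]
print(cache.loads)
```

The two "brick" entries are the same object, so only two loads happen for
three scene objects.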


Re: [pygame] Faster OBJ loader

2010-09-28 Thread Ian Mallett
On Mon, Sep 27, 2010 at 8:05 PM, Christopher Night
cosmologi...@gmail.com wrote:

 Thanks for the response! Assuming that rendering means what I think it
 means (actually drawing the thing on the screen, right?),

Absolutely.

 I was able to improve the rendering performance significantly using vertex
 arrays in the test I did a few months ago. I was still using a display list
 as well, but I greatly reduced the number of GL commands within the display
 list. The trick was to triangulate all the faces, and render all the faces
 for a given material using a single call to glDrawArrays(GL_TRIANGLES...). I
 realize this is hardware-dependent, but the speedup was dramatic on 2 out of
 2 systems that I've tried. Maybe it's not the vertex arrays that matter, and
 triangulating all the faces and using a single call to glBegin(GL_TRIANGLES)
 would yield the same speedup. Either way, it's worth looking into, I
 think.

You must not be using the calls as you think you are, then.  Graphics cards
have a graphics bus that handles data transfer to and from the card.
Unfortunately, this graphics bus is slower than either the CPU or the GPU
themselves.

Immediate mode (glVertex3f(...), glNormal3f(...), etc., which I'm loosely
calling fixed function here) sends the data to the card on each call.  So, if
you have 900 such calls, you send 900 separate commands to OpenGL across the
graphics bus each time the data is drawn.  This can get slow.

Vertex arrays work similarly, except the data is stored as an array, and the
equivalent of the 900 fixed function calls is sent across the graphics bus as
one batch each time the data is drawn.  Although this batched approach is
faster than fixed function, all the data is still transferred to the card
every time.
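
To put rough numbers on the call counts (a back-of-the-envelope sketch,
assuming one normal and one position per vertex; the function names are made
up for illustration):

```python
def immediate_mode_calls(num_triangles):
    # glBegin + glEnd, plus glNormal3f and glVertex3f for each of 3 vertices.
    return 2 + num_triangles * 3 * 2


def vertex_array_calls(num_triangles):
    # glVertexPointer + glNormalPointer + one glDrawArrays, whatever the size.
    return 3


for n in (150, 10000):
    print(n, immediate_mode_calls(n), vertex_array_calls(n))
```

The amount of vertex data is the same either way; only the number of calls
needed to hand it over changes.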

Display lists work by caching operations on the graphics card.  You can
specify nearly anything inside a display list, including fixed function and
(I think) vertex arrays.  To use display lists, you reserve a list number with
glGenLists() and wrap the drawing code in glNewList() and glEndList() calls.
The code inside, after being transferred to the GPU, is stored for later use.
Later, you can call glCallList() with the appropriate list argument.  The
list's NUMBER is transferred to the GPU, and the relevant set of cached
operations is executed.  The practical upshot of all this is that each time
you draw the object, you pass a single number to the graphics card.  This is
the way the Wiki .obj loader works.
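
As a rough sketch of that pattern (not the wiki loader's actual code): the
names compile_display_list and draw_triangle are made up here, and the gl
argument is assumed to be the OpenGL.GL module, passed in once a GL context
exists.

```python
def compile_display_list(gl, draw):
    """Record the GL calls made by draw(gl) into a new display list."""
    list_id = gl.glGenLists(1)              # reserve one list number
    gl.glNewList(list_id, gl.GL_COMPILE)    # start recording, don't draw yet
    draw(gl)                                # these calls get cached
    gl.glEndList()                          # stop recording
    return list_id


def draw_triangle(gl):
    # Immediate-mode calls; between glNewList and glEndList they are only
    # recorded, not executed.
    gl.glBegin(gl.GL_TRIANGLES)
    gl.glNormal3f(0.0, 0.0, 1.0)
    gl.glVertex3f(0.0, 0.0, 0.0)
    gl.glVertex3f(1.0, 0.0, 0.0)
    gl.glVertex3f(0.0, 1.0, 0.0)
    gl.glEnd()


# With a real context (e.g. after pygame.display.set_mode(..., OPENGL)):
#     import OpenGL.GL as GL
#     tri = compile_display_list(GL, draw_triangle)   # once, at load time
#     GL.glCallList(tri)                              # every frame
```

Passing the module in as an argument is only a convenience for this sketch;
a from OpenGL.GL import * at the top of the file works just as well.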

Vertex Buffer Objects are a more advanced topic, but they work by caching
vertex arrays on the GPU.

 Improved loading performance comes from the fact that these arrays can be
 pickled, so you don't have to read them directly from the OBJ file after the
 first time. You only need to distribute the pickled OBJ/MTL files with your
 actual game. Psyco or C might be just as good, but this solution was pretty
 simple, and reduces the (subsequent) load times to almost nothing.

This is actually a fantastic idea; I love it!

 -Christopher

Ian


Re: [pygame] Faster OBJ loader

2010-09-28 Thread Devon Scott-Tunkin
It should be noted that display lists are deprecated (but for all intents
and purposes still there) in the anti-fixed-function OpenGL 3.1+ (and
perhaps completely unavailable in OpenGL ES 2.0?).  So learning VBOs is
probably a good idea.  Like display lists, they mean you aren't passing
vertex arrays over the graphics bus on every draw call, because the vertex
arrays are stored on the graphics card.  The advantage of VBOs over display
lists is that you can store static or dynamic data; the disadvantage is that
you can only store certain kinds of data and not operations, so on some cards
display lists are still faster than VBOs.

Devon


Re: [pygame] Faster OBJ loader

2010-09-28 Thread Ian Mallett
On Tue, Sep 28, 2010 at 1:05 PM, Devon Scott-Tunkin 
devon.scotttun...@gmail.com wrote:

 It should be noted that display lists are deprecated (but for all intents
 and purposes still there) in the anti-fixed function opengl 3.1+ (and
 perhaps completely unavailable in open gl es 2.0?). So learning vbos is
 probably a good idea, and like display lists it also means you aren't
 passing vertex arrays over the graphics bus on every draw call because the
 vertex arrays are stored on the graphics card. The advantage of vbos over
 display lists is you can store static or dynamic data, the disadvantage is
 you can only store certain kinds of data and not operations, so on some
 cards display lists are still faster than vbos.

Correct.  Personally, I think display lists are so ubiquitous that they
aren't going away for decades.  I agree that VBOs are the future though.
They also have the advantage that the program field of vertex attributes can
be dynamically altered.  This is important for programmable shading,
particularly when you want to use more than one shader to draw a single
object.


Re: [pygame] Faster OBJ loader

2010-09-28 Thread Greg Ewing

Ian Mallett wrote:

Correct.  Personally, I think display lists are so ubiquitous that they 
aren't going away for decades.  I agree that VBOs are the future though. 


Note that display lists and VBOs don't have to be mutually
exclusive. A display list could contain instructions to draw
from VBOs.

--
Greg


Re: [pygame] Faster OBJ loader

2010-09-28 Thread Christopher Night
Excellent, thank you very much for the explanation. You're absolutely right
that I'm likely to not be using the calls like I think I am. :-)

I understand now that no matter what's in the display list, only a single
number is passed to the graphics card, so there can't be any optimization on
the outside. However, wouldn't it be possible for some display lists to
execute faster within the graphics card than others?

Attached below is a script that demonstrates what I'm talking about. It
should render a torus repeatedly for 60 seconds per test: first without vertex
arrays (the way the objloader on the wiki does it), and then with vertex
arrays. For me the output is:

Without arrays: 58.0fps
With arrays: 140.0fps

It takes several minutes to run, because the torus has a huge number of
faces it has to generate. I had to do that to get the framerate down.
Anyway, this is the kind of test that suggests to me that vertex arrays
might help. Do you see something wrong with it?

As for VBOs, I know I should learn them. If I can figure them out, and I can
get the same performance from them, that would be preferable. However,
that's just for the sake of using non-deprecated techniques: for an OBJ
loader, there wouldn't seem to be much need for dynamic data.

-Christopher


import pygame
from pygame.locals import *
from math import sin, cos, pi
from OpenGL.GL import *
from OpenGL.GLU import *

tmax = 60.  # Time to run each test for

# Generate the torus faces
nx, ny = 600, 400
xcos = [cos(x * 2 * pi / nx) for x in range(nx+1)]
xsin = [sin(x * 2 * pi / nx) for x in range(nx+1)]
ycos = [cos(y * 2 * pi / ny) for y in range(ny+1)]
ysin = [sin(y * 2 * pi / ny) for y in range(ny+1)]

def coords(x, y):
    return xcos[x] * (2 + ycos[y]), ysin[y], xsin[x] * (2 + ycos[y])
def normals(x, y):
    return xcos[x] * ycos[y], ysin[y], xsin[x] * ycos[y]
faces, vlist, nlist = [], [], []
for x in range(nx):
    for y in range(ny):
        vs = (coords(x,y), coords(x+1,y), coords(x+1,y+1), coords(x,y+1))
        ns = (normals(x,y), normals(x+1,y), normals(x+1,y+1),
              normals(x,y+1))
        faces.append((vs, ns))
        for v in vs: vlist.extend(v)
        for n in ns: nlist.extend(n)

for usearray in (False, True):

    # Initialize pygame and OpenGL
    pygame.init()
    pygame.display.set_mode((640, 480), DOUBLEBUF | OPENGL)

    glLightfv(GL_LIGHT0, GL_POSITION,  (10,10,10, 0.0))
    glLightfv(GL_LIGHT0, GL_DIFFUSE, (0.5, 0.5, 0.5, 1.0))
    glEnable(GL_LIGHT0)
    glEnable(GL_LIGHTING)
    glEnable(GL_COLOR_MATERIAL)
    glEnable(GL_DEPTH_TEST)

Re: [pygame] Faster OBJ loader

2010-09-28 Thread Ian Mallett
Hi,

The display list caches all operations inside of it.  You are correct that
some display lists take longer to call than others, as the cached data may
be different.

In your example, notice that you're wrapping the draw code in a display list
in both cases, so you're caching either the fixed function calls or the vertex
arrays.  Vertex arrays are slightly faster than fixed function, even when they
are cached, which is why the vertex arrays seem faster here: you're really
drawing the vertex arrays as a display list!  However, if you were drawing
vertex arrays normally, versus fixed function in a display list, the fixed
function would be faster.

In order of increasing speed:
1. fixed function
2. vertex arrays
3. display list of fixed function
4. display list of vertex arrays
5. VBO

My comment was that 2 is slower than 3.  However, your test program is
testing 3 against 4.  Usually 4 is not done--it's generally 5 if you need
the flexibility of vertex arrays.  Generally when saying vertex arrays,
that means the program uses no display lists (i.e., 2 is meant instead of
4).

Incidentally, thank you so much for providing an example!

Ian


Re: [pygame] Faster OBJ loader

2010-09-28 Thread Christopher Night
On Tue, Sep 28, 2010 at 3:05 PM, Devon Scott-Tunkin 
devon.scotttun...@gmail.com wrote:

 It should be noted that display lists are deprecated (but for all intents
 and purposes still there) in the anti-fixed function opengl 3.1+ (and
 perhaps completely unavailable in open gl es 2.0?). So learning vbos is
 probably a good idea...


So, dumb question: are VBOs actually implemented in PyOpenGL? I've
started looking into it, and I can't find any working examples of anyone
using a VBO in python. (I've found one that looks okay, but it must be for a
previous version, because the function signatures don't even match up for
me.) I've tried to translate examples from C with no luck. As far as I can
tell, here's how you would set one up (vlist is a numpy array):

gl_buffer = glGenBuffers(1)
glBindBuffer(GL_ARRAY_BUFFER_ARB, gl_buffer)
glBufferData(GL_ARRAY_BUFFER_ARB, vlist, GL_STATIC_DRAW)
glVertexPointerd(0)

The last line gives me an error:

TypeError: ('cannot be converted to pointer', <bound method
PointerType.voidDataPointer of <class
'OpenGL.arrays.arraydatatype.GLdoubleArray'>>)

So... yeah. Any working examples?

Thanks,
Christopher


Re: [pygame] Faster OBJ loader

2010-09-28 Thread Ian Mallett
Hi,

Again, my glLib implements VBOs, using:

from OpenGL.arrays import vbo

If you need further examples, there are some examples in the PyOpenGL
distribution.
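
For what it's worth, here is a minimal sketch along the lines of the raw
buffer calls in the question (not taken from glLib). With a buffer bound, the
generic glVertexPointer takes an offset into that buffer (None for the start)
rather than an array, which appears to be where glVertexPointerd(0) goes
wrong. gl is assumed to be the OpenGL.GL module and vertices a flat numpy
float32 array of x, y, z triples; the function names are made up.

```python
def upload_vertices(gl, vertices):
    """Copy vertex data into a buffer object on the card; done once."""
    buffer_id = gl.glGenBuffers(1)
    gl.glBindBuffer(gl.GL_ARRAY_BUFFER, buffer_id)
    gl.glBufferData(gl.GL_ARRAY_BUFFER, vertices, gl.GL_STATIC_DRAW)
    gl.glBindBuffer(gl.GL_ARRAY_BUFFER, 0)
    return buffer_id


def draw_vertices(gl, buffer_id, vertex_count):
    """Draw triangles straight from the buffer; done every frame."""
    gl.glBindBuffer(gl.GL_ARRAY_BUFFER, buffer_id)
    gl.glEnableClientState(gl.GL_VERTEX_ARRAY)
    # 3 floats per vertex, tightly packed, starting at offset 0 (None).
    gl.glVertexPointer(3, gl.GL_FLOAT, 0, None)
    gl.glDrawArrays(gl.GL_TRIANGLES, 0, vertex_count)
    gl.glDisableClientState(gl.GL_VERTEX_ARRAY)
    gl.glBindBuffer(gl.GL_ARRAY_BUFFER, 0)


# With a real context:
#     import numpy
#     import OpenGL.GL as GL
#     vertices = numpy.array(vlist, dtype=numpy.float32)
#     vbo_id = upload_vertices(GL, vertices)
#     draw_vertices(GL, vbo_id, len(vertices) // 3)
```

The VBO class in OpenGL.arrays.vbo wraps the same generate/bind/upload steps.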

Ian


[pygame] Faster OBJ loader

2010-09-27 Thread Christopher Night
Hi, I'm looking into modifying the well-known objloader.py on the pygame
wiki:

http://www.pygame.org/wiki/OBJFileLoader

I would modify it to use vertex arrays. I think this could improve
efficiency of loading and rendering the models, based on some tests I did a
few months ago on the pyweek message board:

http://www.pyweek.org/d/3066/

I wanted to ask if this work has already been done by anyone, or if there is
a different existing OBJ loader that could be used as a starting point. I
searched this mailing list, and it looks to me like this is currently the best
OBJ loader for pygame.

Thanks,
Christopher


Re: [pygame] Faster OBJ loader

2010-09-27 Thread Ian Mallett
Hi,
On Mon, Sep 27, 2010 at 4:49 PM, Christopher Night
cosmologi...@gmail.com wrote:

 Hi, I'm looking into modifying the well-known objloader.py on the pygame
 wiki:

 http://www.pygame.org/wiki/OBJFileLoader

 I would modify it to use vertex arrays. I think this could improve
 efficiency of loading and rendering the models, based on some tests I did a
 few months ago on the pyweek message board:

Vertex arrays would only be marginally faster than fixed functionality for
*rendering*.  This version loads into display lists, which are about as fast
as possible for that.  You won't be able to get better rendering
performance.

For faster *loading*, you can try Psyco, or just resort to C.

 I wanted to ask if this work has already been done by anyone, or if there
 is a different existing OBJ loader that could be used as a starting point. I
 searched this mailing list, and it looks to me like this is the current best
 OBJ loader for pygame there is.

Having searched around, I'm fairly sure that this is the simplest and best
*standalone* .obj loader.

However, I started with this particular loader and extensively modified it
and improved it to support vertex arrays, vertex buffer objects, display
lists, and fixed functionality.  It also has better support for .mtl files
and handles file loading and texturing more elegantly.  It also calculates
the tangent vectors for use in normalmapping and related techniques.

It's presently integrated into glLib Reloaded
(http://www.pygame.org/project-glLib+Reloaded-1326-.html), my project, which I
humbly present.  The actual loader (glLib/glLibLoadOBJ.py) is heavily tied into
the rest of the library, and as such I can't support using it in other ways,
but you may find it useful.

 Thanks,
 Christopher

Ian


Re: [pygame] Faster OBJ loader

2010-09-27 Thread Christopher Night
On Mon, Sep 27, 2010 at 7:52 PM, Ian Mallett geometr...@gmail.com wrote:


 On Mon, Sep 27, 2010 at 4:49 PM, Christopher Night cosmologi...@gmail.com
  wrote:

 Hi, I'm looking into modifying the well-known objloader.py on the pygame
 wiki:

 http://www.pygame.org/wiki/OBJFileLoader

 I would modify it to use vertex arrays. I think this could improve
 efficiency of loading and rendering the models, based on some tests I did a
 few months ago on the pyweek message board:

 Vertex arrays would only be marginally faster than fixed functionality for
 *rendering*.  This version loads into display lists, which are about as
 fast as possible for that.  You won't be able to get better rendering
 performance.

 Thanks for the response! Assuming that rendering means what I think it
means (actually drawing the thing on the screen, right?), I was able to
improve the rendering performance significantly using vertex arrays in the
test I did a few months ago. I was still using a display list as well, but I
greatly reduced the number of GL commands within the display list. The trick
was to triangulate all the faces, and render all the faces for a given
material using a single call to glDrawArrays(GL_TRIANGLES...). I realize
this is hardware-dependent, but the speedup was dramatic on 2 out of 2
systems that I've tried. Maybe it's not the vertex arrays that matter, and
triangulating all the faces and using a single call to glBegin(GL_TRIANGLES)
would yield the same speedup. Either way, it's worth looking into, I
think.
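
A minimal sketch of that trick, assuming convex faces and a hypothetical
in-memory layout (faces as a list of (material, [vertex, ...]) pairs). It fans
each polygon into triangles and builds one flat coordinate list per material,
which is the kind of array a single glDrawArrays(GL_TRIANGLES, ...) call per
material would consume:

```python
from collections import defaultdict


def triangulate(polygon):
    """Fan-triangulate a convex polygon given as a list of vertices."""
    first = polygon[0]
    for i in range(1, len(polygon) - 1):
        yield first, polygon[i], polygon[i + 1]


def arrays_by_material(faces):
    """Return {material: flat [x, y, z, x, y, z, ...] triangle list}."""
    arrays = defaultdict(list)
    for material, polygon in faces:
        for triangle in triangulate(polygon):
            for vertex in triangle:
                arrays[material].extend(vertex)
    return dict(arrays)


# One quad and one triangle sharing a material, plus a quad with another.
quad = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
tri = [(0, 0, 1), (1, 0, 1), (0, 1, 1)]
faces = [("red", quad), ("red", tri), ("blue", quad)]

arrays = arrays_by_material(faces)
for material, flat in sorted(arrays.items()):
    # Each vertex contributes 3 floats; the vertex count is what
    # glDrawArrays(GL_TRIANGLES, 0, count) would be given.
    print(material, len(flat) // 3)
```

Normals and texture coordinates would go into parallel lists built the same
way.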

Improved loading performance comes from the fact that these arrays can be
pickled, so you don't have to read them directly from the OBJ file after the
first time. You only need to distribute the pickled OBJ/MTL files with your
actual game. Psyco or C might be just as good, but this solution was pretty
simple, and reduces the (subsequent) load times to almost nothing.
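
A sketch of that caching, with a deliberately tiny stand-in parser (parse_obj
only reads the v lines) and made-up names (load_cached, the .pickle suffix).
The real arrays would hold normals, texture coordinates, and per-material face
data as well:

```python
import os
import pickle


def parse_obj(path):
    """Slow path: read vertex positions from the 'v' lines of an OBJ file."""
    vertices = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if parts and parts[0] == "v":
                vertices.extend(float(p) for p in parts[1:4])
    return vertices


def load_cached(path):
    """Parse the OBJ once, then reuse the pickled arrays on later runs."""
    cache_path = path + ".pickle"
    # Reuse the cache only if it is at least as new as the OBJ file.
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(path)):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    vertices = parse_obj(path)
    with open(cache_path, "wb") as f:
        pickle.dump(vertices, f, pickle.HIGHEST_PROTOCOL)
    return vertices
```

Unpickling can execute arbitrary code, so this only makes sense for cache
files you generated yourself.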

I'm very interested in any more feedback you have on this! I'm a real
beginner when it comes to OpenGL stuff!


 I started with this particular loader and extensively modified it and
 improved it to support vertex arrays, vertex buffer objects, display lists,
 and fixed functionality.  It also has better support for .mtl files and
 handles file loading and texturing more elegantly.  It also calculates the
 tangent vectors for use in normalmapping and related techniques.

 It's presently integrated into glLib Reloaded
 (http://www.pygame.org/project-glLib+Reloaded-1326-.html), my project, which
 I humbly present.  The actual loader (glLib/glLibLoadOBJ.py) is heavily tied
 into the rest of the library, and as such I can't support using it in other
 ways, but you may find it useful.


Great, thank you so much! I'll definitely have a look!

-Christopher