Re: [pygame] Python and Speed
The way I speed up my Python code is ctypes. I just build a DLL in C or asm and then call it with ctypes, and presto: tons of speed at my fingertips. Just my 2 cents :)
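Jason's ctypes approach can be sketched roughly like this. In a real game you would point `CDLL` at the `.dll`/`.so` you compiled from C or asm; here the system C math library stands in so the snippet runs as-is (the library name lookup is platform-dependent, so treat this as an illustration, not a recipe):

```python
import ctypes
import ctypes.util

# Load a shared library. For your own code this would be something like
# ctypes.CDLL("./my_fast.so") -- here we borrow the system math library.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the signature so ctypes converts Python floats correctly;
# without this, double arguments and return values get mangled.
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

print(libm.sqrt(16.0))  # 4.0
```

Declaring `restype`/`argtypes` is the step people most often forget; ctypes defaults to `int` for return values, which silently corrupts doubles.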
Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 2:21 PM, Greg Ewing [EMAIL PROTECTED] wrote:
> René Dudfield wrote:
>> 2. - asm optimizations. There seem to be almost no asm optimizations in CPython.
> That's a deliberate policy. One of the goals of CPython is to be very portable and written in a very straightforward way. Including special pieces of asm for particular architectures isn't usually considered worth the maintenance effort required.

Other, more portable software has proved this to be somewhat wrong, I think. Optional asm paths are used in a lot of software today to good effect. Also, this decision was made a while ago, and things have changed since then:
- Python now has unit tests, so verifying that the asm code works, and keeps working correctly, is much easier.
- x86 is now very common. Most mainstream servers and desktops use x86, so targeting just x86 gives you a lot more benefit now.
- SIMD instructions are the fast ones, so you don't actually have to learn all that much to write fast asm - only a subset. You can get a compiler to generate the first pass of a function and then hand-tune it. Writing the fastest possible asm still requires effort, but it is fairly easy for a novice to beat a compiler with SIMD code.
- Advanced compilers can generate asm which can then be used by weaker compilers: e.g. the Intel or VectorC compiler can generate asm that is then included in C code compiled by gcc.
- Libraries of fast, tested asm code are available, e.g. from AMD, Intel and others.
- Python and FOSS now have a much larger development community, with more asm experts.

>> CPython could use faster threading primitives, and more selective releasing of the GIL.
> Everyone would love to get rid of the GIL as well, but that's another Very Hard Problem about which there has been much discussion, but little in the way of workable ideas.

Yeah - not getting rid of the GIL entirely, but selectively releasing it. As an example, pygame releases the GIL around certain C functionality like pygame.transform.scale. FreeBSD and Linux have also followed this method - adding more fine-grained locking where it is worth it, and improving their threading primitives. I think there has already been work on fixing a lot of Python threading issues in the last year, but there's lots more to do. I'm using Python on 8-core machines for my workloads just fine today.

>> A way to know how much memory is being used. Memory profiling is the most important way to optimize since memory is quite slow compared to the speed of the cpu.
> Yes, but the amount of memory used doesn't necessarily have anything to do with the rate of memory accesses. Locality of reference, so that things stay in the cache, is more important.

If you are using 200 bytes for each int, then you can process 50x less data than with an int that takes up 4 bytes. If you have 1 GiB of available memory, and kjDict uses half the memory of a normal dict, then where a normal dict would use 2 GiB your kjDict uses 1 GiB - and in that case the kjDict would be massively faster than a normal dict because it avoids swapping. I think memory is one of the most important areas in optimising a program these days, so Python should provide tools to help measure memory use (how much memory things take, and how things allocate memory).

>> perhaps releasing a patch with a few selected asm optimizations might let the python developers realise how much faster python could be...
> Have you actually tried any of this? Measurement would be needed to tell whether these things address any of the actual bottlenecks in CPython.

You can try it easily yourself - compile Python with machine-specific optimisations (e.g. add -mtune=athlon to your gcc arguments). You can run this Python binary and get faster benchmarks. This provides the proof that more optimised assembly can run faster. Also, the link I gave to a commonly used memcpy function running 5x faster should provide another proof of the possibilities. Other software being sped up by asm optimisation provides another proof (including SDL, pygame, Linux, etc.). The Pawn language's virtual machine written in NASM is a lot faster than the version written in C - another proof. Psyco is another proof that asm can speed up Python (Psyco is a run-time assembler). The idea is that you only optimise key, stable functions in asm - not everything. For example, in SDL the blit functions are written in asm, with C fallback implementations. It's using the best tool for the job: Python at the highest level, then C, then asm. I think a patch for CPython would need to come with benchmarks as a proper proof, though - but hopefully the list above provides theoretical proof that adding asm optimisations would speed up CPython. However, the recompilation with CPU-specific compiler flags would only need: - cpu
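The per-int memory overhead mentioned above can be checked from a stock Python prompt: sys.getsizeof reports the per-object footprint, and the array module shows the packed alternative (a rough illustration of the kind of measurement tooling being asked for, not a full memory profiler):

```python
import sys
import array

# A CPython int carries object-header overhead well beyond the 4 bytes
# a C int needs (~28 bytes on a 64-bit CPython 3.x build).
print(sys.getsizeof(1))

# A list stores references, each pointing at a full int object.
nums = list(range(1000))
print(sys.getsizeof(nums))

# array.array stores raw machine values, cutting per-element overhead
# to the C size ('i' is a C int, typically 4 bytes).
packed = array.array('i', range(1000))
print(packed.itemsize, packed.itemsize * len(packed))
```

numpy takes the same packed-storage idea further and adds fast vectorised operations on top.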
Re: [pygame] Unexpected blit behavior when using BLEND_ADD
On Wed, Apr 16, 2008, Lenard Lindstrom wrote:
> I was trying out BLEND_ADD with blit to see if an alpha plane could be added to a surface with alphas of zero. Here is the trial session: [...]

Both surfaces contain alpha, so only alpha blending will be applied. The BLEND_* args only work for cases where alpha blending cannot be applied. Should we change this to take alpha into account for the BLEND_* args as well, and only apply the default alpha blending if no BLEND_* args are passed?

Regards
Marcus
Re: Re: [pygame] Python and Speed
René Dudfield [EMAIL PROTECTED] wrote:
> On Thu, Apr 17, 2008 at 2:21 PM, Greg Ewing [EMAIL PROTECTED] wrote:
> [... the asm optimization discussion quoted in full ...]

I think this is the wrong forum to be having this discussion :)

Richard
Re: [pygame] Unexpected blit behavior when using BLEND_ADD
On Wed, Apr 16, 2008 at 11:48 PM, Marcus von Appen [EMAIL PROTECTED] wrote:
> Both surfaces contain alpha, so only alpha blending will be applied. The BLEND_* args only work for cases where alpha blending cannot be applied. Should we change this to take alpha into account for the BLEND_* args as well, and only apply the default alpha blending if no BLEND_* args are passed?

I think if somebody passes the BLEND_* args, their intent is pretty clear: they want you to respect the blend args passed, regardless of whether the surface has alpha. If you pass BLEND_* args and they don't change the operation at all, that seems like useless behavior for the API. ... That being said, I absolutely think the end result of Lenard's test is right - specifically, if I do an additive blit with a source whose alpha is zero, the image should be unchanged, because RGBA is a color+transparency mode in other contexts (i.e. an alpha of 0 means do nothing). In terms of art production, alpha'ing out pixels is a great way to make art that is a particular color but only lightly applied. Also, it can be useful to render the same art with both multiplicative and additive blending to get a glowy effect that doesn't saturate to white.
Re: Re: [pygame] Python and Speed
Hi!

No, this is the place to discuss it: if we wish to make games, work with existing platforms, and want speed, this is the way to go. Now that we have had this discussion and found solutions, we have a list of ways to resolve it. This is the place to discuss all of this; it brings to the front the issues of speed, connections, and overall solutions - the only way to make Pygame better, faster and competitive with the world. Just like the question I had with TTS (text to speech): even though I do not use the video end yet, I do use the sound end. So Ian's speed question is a very good one - take a look at what came of it below. I learn by doing; examples help because I get to use and tweak them, and eventually, like many have done, come up with a better solution or firm conclusion. I now understand adding options into my setup.py file - or, as I now call it, setup4tts.py - or anything for any need...

Bruce

From: Richard Jones
> I think this is the wrong forum to be having this discussion :)

From: Jason Ward
> The way I speed up my python code is Ctypes. I just make a dll file in C or asm and then call it with Ctypes and presto. I have tons of speed at my fingertips. Just my 2 cents :)

[... the full René Dudfield / Greg Ewing asm-optimization exchange quoted above ...]
Re: Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 2:21 PM, Greg Ewing [EMAIL PROTECTED] wrote:
[... the asm optimization discussion quoted above ...]

I see a slight problem with these architecture-specific optimisations: once you release your game to the general public, even if you were using super-optimized awesomeness and everything ran fine, the majority of people will be running whatever version of the library came with their OS rather than compiling their own, and it might be somewhat lacking.
Re: [pygame] [BUG] pixelarray_test.py reports errors then hangs for Python 2.4.4
On Wed, Apr 16, 2008, Lenard Lindstrom wrote:
> Marcus von Appen wrote:
>> On Wed, Apr 16, 2008, Rene Dudfield wrote:
>>> hey, you can make a unittest, and then commit to svn.
>> If you tell me what to test for ;-). The bug sounds more like something happening elsewhere mashing up some memory addresses or so and causing later tests to fail.
> Sorry, this BUG is a false alarm. When I recompiled Pygame SVN I didn't install it for Python 2.4, so the Pygame 1.8.1pre tests were run with Pygame 1.8.0release. The tests pass for Pygame 1.8.1pre and Python 2.4.

Not necessarily. Though the crash might have been caused by that as well, I discovered another bug in the subscript implementation, which might try to write to memory outside of the pixel memory. Fixed in rev. 1220.

Regards
Marcus
Re: Re: [pygame] Python and Speed
> I see a slight problem with these architecture specific optimisations: once you release your game to the general public [...] the majority of people will be running whatever version of the library came with their OS rather than compiling their own, and it might be somewhat lacking.

That's a good point; so everyone should run Gentoo! ;)

As for memory management, whatever became of PyMalloc? I haven't heard much about it in a good while.

-- This, from Jach.
Re: [pygame] Python and Speed
On Apr 16, 2008, at 6:36 PM, Ian Mallett wrote:
> Recently, I've been having issues in all areas of my programming experience where Python is no longer fast enough to fill the need adequately. I really love Python and its syntax (or lack of), and all the nice modules (such as PyGame) made for it. Are there any plans to improve Python's speed to at least the level of C languages? I feel like Python is living down to its namesake's pace... Ian

Python is slow; it is an interpreted language geared toward rapid application development at the expense of execution speed. It is also designed to be highly portable, again at the potential expense of execution speed. That said, much effort is put into making Python perform well, and it is certainly possible to write extremely fast Python programs when an algorithm is available that lets you make clever use of data structures to get the job done more efficiently. Optimization is as much about data structures as it is about code (maybe more). Python comes with tools (cProfile in particular) that let you figure out where your code is spending its time so you can speed it up. Note that by removing inefficient code, or choosing a better algorithm, it is possible to get massive speedups in Python or any language.

Sometimes (especially in things like graphics and simulations) there is no magic algorithm to make things more efficient. Many times the best you can do is O(N), O(N^2) or even O(2^N). In these cases Python runs out of steam real fast (in fact, any language runs out of steam fast with O(2^N) algorithms). This is especially pronounced in games where you have to perform operations over large arrays once per frame, which is typical in 3D graphics, physics and particle systems. What you must often do in such cases is move the inner loop of these operations into native code.

The easiest way to do this is to find a native library that already performs the operations you need. This could be a generic library (like numpy) or something more specific (like pygame or ode). This is especially easy if the library already has a Python binding; if not, you can make one using ctypes, Pyrex or C directly. The downside here is the additional dependencies. If nothing exists that does what you want, you'll need to write some native code yourself. First identify the slow code. Make sure there is no way to make it fast enough within Python (is there a better algorithm? etc.). Then move this code into an extension, written however you prefer. Note that it is fairly easy to create an extension module that defines some functions in C that can be imported into Python (especially if they just do number crunching); the most complex aspect is the memory management. Tools like Pyrex can automate that for you, but don't give you as close control over the code.

But before you do any of this, do some profiling (run your game under 'python -m cProfile') and see what's taking up the time. How much time per frame do you have for game logic? Is your logic too slow, or is the drawing too slow? Honestly, my biggest problem with pygame is not Python being too slow, but the fill rate of SDL being too slow. I can solve the former with optimization and extensions, but the latter requires much more compromise (lower res, smaller sprites, etc.). The hard reality is that the CPU is often too slow for graphics work no matter what language you use (even highly tuned SIMD asm code). That's why we have dedicated graphics hardware designed for the task. Right now SDL doesn't really take advantage of that, and that's a big limitation.

-Casey
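The profiling step Casey describes can also be done programmatically with the cProfile API, which is handy for profiling just the game loop rather than the whole program (update_particles is a made-up stand-in for a per-frame workload):

```python
import cProfile
import io
import pstats

def update_particles(points):
    # Hypothetical per-frame work: move every particle by (1, 1).
    return [(x + 1, y + 1) for x, y in points]

points = [(i, i) for i in range(10000)]

profiler = cProfile.Profile()
profiler.enable()
for frame in range(50):          # simulate 50 frames
    points = update_particles(points)
profiler.disable()

# Report the 5 most expensive entries by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

For a quick whole-program view, `python -m cProfile -s cumulative yourgame.py` gives the same kind of report without touching the code.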
Re: [pygame] Unexpected blit behavior when using BLEND_ADD
On Thu, Apr 17, 2008, Brian Fisher wrote:
> On Thu, Apr 17, 2008 at 10:41 AM, Lenard Lindstrom [EMAIL PROTECTED] wrote:
>> The documentation is vague on this. It just says that BLEND_ flags are optional. Nothing is mentioned about them being ignored.
> Even if the docs explicitly said the special flags are ignored for surfaces with alpha, I would still argue that it is useless and confusing behavior to ignore these flags when used with a surface with alpha, and the behavior (and docs) should be changed. If they use the flags they are [...]

I guess we agree that the current behaviour and docs are bad for this case. How would I expect it to work? BLEND_* behave like the current implementation but take the alpha into account as well, if it exists for both surfaces. If it exists for only one, it will be ignored. Thus e.g. BLEND_ADD would create the following results:

Both surfaces have alpha: (10, 20, 30, 40) + (20, 30, 40, 50) = (30, 50, 70, 90)
Only one has alpha: (10, 20, 30) + (20, 30, 40, 50) = (30, 50, 70 [, 50])

In the latter case the alpha value will stay the same if the destination has alpha; otherwise it will be silently ignored. If no blending flags are passed and both surfaces have alpha, the current alpha-blending implementation will be used. Otherwise the blit will behave as usual. Is such a behaviour okay for everyone?

Regards
Marcus
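Marcus's proposed BLEND_ADD semantics can be sketched per-pixel in plain Python. This models the proposal above, not pygame's current behavior; real additive blits also clamp each channel at 255, which is included here:

```python
def blend_add(src, dst):
    """Per-channel saturating add. Alpha is included only when both
    pixels carry it (the semantics proposed in the message above)."""
    n = min(len(src), len(dst))
    out = [min(src[i] + dst[i], 255) for i in range(n)]
    # If only the destination carries alpha, its alpha stays unchanged;
    # a source-only alpha is silently dropped by the min() above.
    if len(dst) > n:
        out += list(dst[n:])
    return tuple(out)

print(blend_add((10, 20, 30, 40), (20, 30, 40, 50)))  # (30, 50, 70, 90)
print(blend_add((10, 20, 30), (20, 30, 40, 50)))      # (30, 50, 70, 50)
```

Both printed results match Marcus's worked examples; the clamping only matters once a channel sum exceeds 255.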
Re: [pygame] Python and Speed
On Apr 17, 2008, at 11:59 AM, Ian Mallett wrote:
> [..] This is precisely the problem I have run into in one of my in-dev games--iterating over large arrays once per frame. Actually, it is basically a collision detection algorithm--I have two arrays, both containing 3D points. The points in one array must be tested with the points in the other to see how close they are. If they are close enough, there is a collision. Naturally, this means that for every point in one array, the other array must be iterated through and the 3D Pythagorean theorem applied to each tested point.

Note this is not the most efficient way to do this; using a partitioned space you may be able to avoid comparing most points with one another most of the time. To do this in 2D you could use quad-trees, in 3D you could use oct-trees. See: http://en.wikipedia.org/wiki/Octree

> The easiest way to do this is to find a native library that already performs the operations you need.
> This seems like a rather uncommon function--I haven't found that which I need.

Note ode already implements efficient 3D collision detection in naive code; I believe pymunk does this for 2D. pyode is a Python wrapper for ode.

FWIW, you would get better answers to your problems by asking more specific questions. If you had asked "How do I make collision detection faster?" you would have gotten much better answers than asking "How do I make Python faster?".

-Casey
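The approach Ian describes - every point in one array tested against every point in the other - looks roughly like this O(N*M) baseline (comparing squared distances avoids a sqrt per pair; the function name and data are illustrative):

```python
def naive_collisions(a, b, radius):
    """Test every 3D point in a against every 3D point in b:
    the brute-force pairwise check described above."""
    hits = []
    r2 = radius * radius  # compare squared distances: no sqrt needed
    for p in a:
        for q in b:
            dx, dy, dz = p[0] - q[0], p[1] - q[1], p[2] - q[2]
            if dx * dx + dy * dy + dz * dz <= r2:
                hits.append((p, q))
    return hits

print(naive_collisions([(0, 0, 0)], [(3, 4, 0), (10, 10, 10)], 5))
# [((0, 0, 0), (3, 4, 0))]
```

With 1000 points per array this is already a million inner-loop iterations per frame, which is exactly where a spatial partition pays off.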
Re: [pygame] Python and Speed
On Apr 17, 2008, at 12:15 PM, Casey Duncan wrote: Note ode already implements efficient 3D collision detection in naive code, I believe pymunk does this for 2D. pyode is a python wrapper for ode. heh, I meant to say native code 8^) -Casey
Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 12:15 PM, Casey Duncan [EMAIL PROTECTED] wrote:
> Note this is not the most efficient way to do this; using a partitioned space you may be able to avoid comparing most points with one another most of the time. To do this in 2D you could use quad-trees, in 3D you could use oct-trees. See: http://en.wikipedia.org/wiki/Octree

Yes, I've tried this, but there are issues with points near the boundary between two partitions. For example, if the collision radius is 5, and a point is 3 away from the edge, then all the points in the neighboring trees must be tested as well.

> Note ode already implements efficient 3D collision detection in naive code; I believe pymunk does this for 2D. pyode is a Python wrapper for ode.

I'll look into it.

> FWIW, you would get better answers to your problems by asking more specific questions. If you had asked "How do I make collision detection faster?" you would have gotten much better answers than asking "How do I make Python faster?".

Well, my question was actually "How can Python be made faster?" The collision detection was an example of where it is a problem.

Ian
Re: [pygame] Python and Speed
On Wed, Apr 16, 2008 at 7:30 PM, Ian Mallett [EMAIL PROTECTED] wrote:
>> Thus it falls to you as a developer to choose your implementation strategy wisely:
> But again, this is treating the symptoms, not the problem...

I actually think the line of thinking I read in the comment above (that I shouldn't have to optimize things, because the stupid language is slow) is in fact a counterproductive attitude in application development, and misses the real point. It would be nice of course if Python ran much, much faster, but it runs slow largely because it is designed to give you the flexibility to code complex things much more easily. If you don't want that flexibility, then you need to turn it off with Pyrex and extensions and all that kind of stuff. However, sometimes that flexibility actually lets you code more efficient approaches to begin with.

Ultimately all slowness is the programmer's problem, not the tool's. If a particular tool is the best to help you solve the problem, then it should be used. With Python, coolness is always on, so it's cheap to use coolness. C++ was designed to make you not pay for anything you don't use, which means coolness is default off, which means it's really hard to use coolness.

...To get to brass tacks, though: I've found that the majority of the real slowness in _anything_ I code is due to the approach I take, and much less so due to the code. For example, pairwise collision between a set of objects. If every object needs to be checked against every object, that's an n-squared problem. Get 1000 items, that's 1,000,000 collision checks. But let's say I do object partitioning, or bucketing, or something where I maintain sortings of the objects that let me only check items against ones that are close to them, and I either get log(n) partitioning or maybe at most about 10 items per bucket (both very achievable goals). Now I only do about 10,000 (10*1000) collision checks for the same real work being done.

So let's say my Python collision code takes 100 times as long as my C++ collision code - that means if I do the optimization in Python, I can get the Python code to go just as fast as the C code without the optimization. Not only that - let's say I decide to step things up to 10,000 items with pairwise collision - now it's 100,000,000 checks versus maybe 100,000 depending on the approach - now Python can actually be 10 times faster. So now the issue becomes: what's the cost of writing the more efficient approach in Python versus writing the naive approach in C++? If you think you get enough programmer benefits from working in Python to make those two costs equal, and the performance of either is good enough, Python is the better choice. Not only that: once you've got good approaches written in Python that are stable, and you don't need the coolness/flexibility, it becomes much easier to port the stuff to C, or Pyrex it, or whatever makes it much, much faster.

On Thu, Apr 17, 2008 at 11:59 AM, Ian Mallett [EMAIL PROTECTED] wrote:
> [Casey talked about complexity]
> This is precisely the problem I have run into in one of my in-dev games--iterating over large arrays once per frame. Actually, it is basically a collision detection algorithm--I have two arrays, both containing 3D points. [...] for every point in one array, the other array must be iterated through and the 3D Pythagorean theorem applied to each tested point.

Sounds like your approach is O(N^2). If most points aren't close enough to collide, partitioning so that you don't even have to do the check will do wonders.
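Brian's bucketing idea can be sketched as a simple uniform grid, which is the easiest spatial partition to write (the cell size is a made-up choice; picking it no smaller than the collision radius means checking the 27 neighbouring cells also covers Ian's worry about points near a partition edge):

```python
from collections import defaultdict

CELL = 10.0  # bucket size; must be >= the collision radius (assumption)

def bucket_key(p):
    # Map a 3D point to the integer grid cell containing it.
    return (int(p[0] // CELL), int(p[1] // CELL), int(p[2] // CELL))

def grid_collisions(a, b, radius):
    """Hash the points of b into grid cells, then test each point of a
    only against the 27 cells around it instead of all of b."""
    grid = defaultdict(list)
    for q in b:
        grid[bucket_key(q)].append(q)
    r2 = radius * radius
    hits = []
    for p in a:
        kx, ky, kz = bucket_key(p)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    for q in grid.get((kx + dx, ky + dy, kz + dz), ()):
                        ex, ey, ez = p[0] - q[0], p[1] - q[1], p[2] - q[2]
                        if ex * ex + ey * ey + ez * ez <= r2:
                            hits.append((p, q))
    return hits

print(grid_collisions([(0, 0, 0)], [(3, 4, 0), (100, 0, 0)], 5))
# [((0, 0, 0), (3, 4, 0))]
```

With points spread out, each query touches only a handful of buckets, which is where the 1,000,000-checks-down-to-10,000 figure in the message comes from.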
Re: [pygame] Python and Speed
On Apr 17, 2008, at 12:26 PM, Ian Mallett wrote:
> On Thu, Apr 17, 2008 at 12:15 PM, Casey Duncan [EMAIL PROTECTED] wrote:
>> Note this is not the most efficient way to do this; using a partitioned space you may be able to avoid comparing most points with one another most of the time. To do this in 2D you could use quad-trees, in 3D you could use oct-trees. See: http://en.wikipedia.org/wiki/Octree
> Yes, I've tried this, but there are issues with points near partition boundaries. For example, if the collision radius is 5, and a point is 3 away from the edge, then all the points in the neighboring trees must be tested.

Partitioned space is certainly a more complex algorithm, but so long as all of your points (spheres?) are not close together, it is usually vastly more efficient. If the partition size is optimal, then the vast majority of particles will not be hitting the edge of a partition; that will be an edge case (pun intended). Even for those that are, it's usually still faster than the naive O(N^2) method that compares every point with every other. This algorithm is only effective if the space is large relative to the collision geometries and they tend not to be clumped very close together.

-Casey
Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 12:39 PM, Brian Fisher [EMAIL PROTECTED] wrote: I actually think the line of thinking I read in the comment above (thinking that I shouldn't have to optimize things, because the stupid language is slow) is in fact a counterproductive attitude in application developement, and misses the real point. Obviously it is the problem of the programmer--it is the programmer who programs his program, not Python. Python just executes it. But the fact is, it makes a programmer's job easier if he has unlimited power to work with. Currently, I find myself having to stretch Python's limits, and, as you say, find optimizations. Programming is fun, but rewriting code so that you can meet a reasonable performance benchmark is not. It would be nice of course if python ran much much faster, but it runs slow largely because it is designed to give you the flexibility to code complex things much easier. I like this aspect of Python--its flexibility, but I object to the lack of speed. I want it all--flexibility and speed. If you don't want that flexibility, then you need to turn it off with pyrex and extensions and all that kind of stuff. I actually can't think of any situation where I would really need to do that. However sometimes that flexibility actually lets you code more efficient approaches to begin with. ...because the code is more clear. Better looking code usually runs faster, because clear code allows one to see any performance sucking bugs... Ultimately all slowness is the programmers problem, not the tools[']. Of course. The programmer is the one who makes the program. The users would complain to the programmer, not Python, and, uh, they do. If a particular tool is the best to help you solve the problem, then it should be used. With python, coolness is always on, so it's cheap to use coolness. C++ was designed to make you not pay for anything you don't use, which means coolness is default off, which means it's really hard to use coolness. 
...to get to brass tacks though, I've found that the majority of the real slowness in _anything_ I code is due to the approach I take, and much less so due to the code. For example, pairwise collision between a set of objects. If every object needs to be checked against every object, that's an n-squared problem. Get 1000 items, that's 1,000,000 collision checks. But lets say I do object partitioning, or bucketing or something where I maintain sortings of the objects in a way that lets me only check items against ones that are close to it, and I either get log(n) partitioning or maybe I get at most about 10 items per bucket (both very achieveable goals). Now it means I only do about 10,000 (10*1000) collision checks for the same real work being done. This is work that must be done. To do this in my case would be somewhat complicated, as I would need interpartition testing, boundary testing on the partitions and on each point, and various other modifications. Of course the code could be made faster, but this is something that I would have to do to get this program functioning at a good speed. Why not make Python faster, making such annoying modifications unnecessary and speeding up all Python in the process? So lets say that my python collision code takes 100 times as long as my c++ collision code - that means if I do the optimization in python, I can get the python code to go just as fast as the C code without the optimization. Not only that - lets say I decide I want to step stuff up to 10,000 items with pairwise collision - now it's 100,000,000 checks vs. like say 100,000 based on the approach - now python can actually be 10 times faster. That's an optimization which takes time and effort to implement. A C programmer very often has no need to do such optimizations, though he works with code I find horrid by comparison. So now the issue becomes whats the cost of writing the more efficient approach in python code vs. writing the naive approach in c++ code. 
If you think you get enough programmer benefits from working in python to make those 2 costs equal, and the performance of either is good enough, python is the better choice. Not only that, once you've got good approaches written in python that are stable and you don't need the coolness/flexibility, it becomes much easier to port the stuff to C, or pyrex it or whatever makes it much, much faster. The whole point of using Python, for me, is that it is far more flexible and programmer-friendly than anything else I've seen. I don't want to have to make a choice between Python and C just on a matter of speed--Python should be the clear choice. They should be equal in speed, but Python is easier to use. Obvious choice? Python. Sounds like your approach is O(N^2). If most points aren't close enough to do the collision, partitioning to make it so you don't even have to do the check will do wonders. Again, this problem is merely an example.
Re: [pygame] Python and Speed
Hi! I am just making an observation on this and objects, maybe I am missing the point, but when checking collisions, if you know your object's size, the vertex, or point depending on direction, could you not solve this by just the direct line between the 2 objects and not the surface? What I mean is, you are in control of your game, you know your objects, thus, you also know the point that will be hit depending on the direction you are traveling in, so why not just check that point or small collection of points? That is what I would do when running this kind of game. My Star Trek game, for instance, does not use the actual pixel, but the area in which it is, and that is a larger area. But when using pixel points, then only the points that fall in line with the direction you are traveling in. That seems to me to be a much faster check; little to almost a single point is what seems to be the result of this... In other words, you know your objects and where they are, then just draw a straight line between them and calculate the edge at that line drawn. I am not saying draw a line, just calculate to the edge of the object from both... Bruce On Apr 17, 2008, at 12:26 PM, Ian Mallett wrote: On Thu, Apr 17, 2008 at 12:15 PM, Casey Duncan [EMAIL PROTECTED] wrote: Note this is not the most efficient way to do this, using a partitioned space you may be able to avoid comparing most points with one another most of the time. To do this in 2D you could use quad-trees, in 3D you could use oct-trees. See: http://en.wikipedia.org/wiki/Octree Yes, I've tried this, but there are issues with points being in two separate places. For example, if the collision radius is 5, and it is 3 away from the edge, then all the points in the neighboring trees must be tested. Partitioned space is certainly a more complex algorithm, but so long as all of your points (spheres?) are not close together, it is usually vastly more efficient. 
If the partition size is optimal, then the vast majority of particles will not be hitting the edge of a partition; that will be an edge case (pun intended). Even for those that are, it's usually still faster than the naive O(N^2) method that compares every point with every other. This algorithm is only effective if the space is large relative to the collision geometries and they tend not to be clumped very close together. -Casey
Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 12:45 PM, Casey Duncan [EMAIL PROTECTED] wrote: Partitioned space is certainly a more complex algorithm, but so long as all of your points (spheres?) Yes, one can think of one array as points and the other as spheres. are not close together, it is usually vastly more efficient. If the partition size is optimal, then the vast majority of particles will not be hitting the edge of a partition; that will be an edge case (pun intended). Even for those that are, it's usually still faster than the naive O(N^2) method that compares every point with every other. This algorithm is only effective if the space is large relative to the collision geometries and they tend not to be clumped very close together. In 3D space, there is a good deal of room, so I can see this being effective. But again, this is only one example of countless slowdowns I've had. Let's fix all of them in one fell swoop, no? The program is slowing down because the computer is processing the program relatively inefficiently. This is due to Python. Programmers shouldn't have to optimize their inefficiently executed code; the code should just be executed efficiently. -Casey Ian
Re: [pygame] Python and Speed
I'm not sure precisely what you mean... Again, remember that this is an example, not the question. The question is "How can Python be made faster?" This is an example of one of the problems resulting from Python's relative slowness. Here's the example: -There is a list of 3D points -There is another list of 3D points. -Every frame, for every point in the first list, if any point in the second list is a certain 3D distance away, then there is a collision. Ian
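The example Ian describes, written naively, is a straightforward double loop (function and variable names here are hypothetical, just to make the cost concrete):

```python
import math


def naive_collisions(points_a, points_b, radius):
    """O(len(a) * len(b)): every point in the first list is tested
    against every point in the second list, every frame."""
    hits = []
    for ax, ay, az in points_a:
        for bx, by, bz in points_b:
            dist = math.sqrt((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2)
            if dist <= radius:
                hits.append(((ax, ay, az), (bx, by, bz)))
    return hits


print(naive_collisions([(0.0, 0.0, 0.0)],
                       [(1.0, 0.0, 0.0), (9.0, 9.0, 9.0)],
                       radius=2.0))
```

With 1000 points in each list this inner body runs 1,000,000 times per frame, which is why the thread keeps coming back to better algorithms rather than a faster interpreter.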
Re: [pygame] Python and Speed
In that specific case, no matter which programming language you use, your code will not be very fast. Do you think programmers write unoptimized code in C, and get speedy execution every time? Have you not ever used a program or played a game which ran slower than it should, which was actually programmed in a fast language? Optimizing code and improving algorithms have been around far longer than Python, and are an important part of programming in general. As has been mentioned before, there have been many attempts to optimize core Python, which have resulted in some improvements. Python 2.5 is considerably faster than Python 1.0. However, due to the nature of the language it can only be optimized so much. The best bet really, is to write the code that needs to be fast in C, and call that code from Python, using Python as a glue language. This can be accomplished using implementations that already exist (pyode, numpy) or writing a new implementation and exposing it with pyrex or other linking programs. There is no magic bullet that will make Python faster, and it's not for lack of trying. Even at the most optimized it could theoretically be, I don't think pure Python will ever be as fast as C. I do hope it gets close, but even if this were to be the case, your collision detection code would still be slow as heck. Culling algorithms for this purpose were invented to speed up applications written in C, after all :)
Re: [pygame] Python and Speed
On Thu, 17 Apr 2008, you wrote: No, this is the place to discuss it because if we wish to make games, work with existing platforms, and want speed, that is the way to go. Now that we have had this discussion, and found solutions, now we have a list of ways to resolve it. Sure, discuss ways to work with the current interpreter, but it's quite pointless talking about changes to the interpreter itself. I'm almost certain none of the core devs are on this list, and the python-dev list is a much more appropriate place to discuss it anyway :) Richard
Re: [pygame] Unexpect blit behavior when using BLEND_ADD
Marcus von Appen wrote: Should we change this to take alpha into account for the BLEND_* args as well and only apply the default alpha blending, if no BLEND_* args are passed? Maybe what's needed is another set of flags to specify what transfer function to use for the alpha channel, independently of the RGB channels. A good place to look for ideas on this would be the set of blending functions specified by OpenGL. It seems to be fairly comprehensive and is based on a lot of experience of what people have found useful. In particular, it has modes which specify the alpha combining function separately, which seems to be what is needed here. -- Greg
Re: [pygame] Python and Speed
Patrick Mullen wrote: Also, if you are using sqrt for your distance check, you are likely wasting cpu cycles, if all you need to know is whether they are close enough. Also note that if you do need to compare exact Euclidean distances for some reason, you can avoid square roots by comparing the squares of the distances instead of the distances themselves. -- Greg
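Greg's squared-distance trick looks like this in code (a small sketch; the function name is just for illustration):

```python
def close_enough(p, q, radius):
    """Compare squared distance against squared radius: same result
    as comparing distances, but no sqrt per pair."""
    dx, dy, dz = p[0] - q[0], p[1] - q[1], p[2] - q[2]
    return dx * dx + dy * dy + dz * dz <= radius * radius


print(close_enough((0, 0, 0), (3, 4, 0), 5))  # True: 9 + 16 = 25 <= 25
```

This works because squaring is monotonic for non-negative values, so the comparison is exact, not an approximation - it does not change the margin of error at all.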
Re: [pygame] Python and Speed
Ian Mallett wrote: Here's the example: -There is a list of 3D points -There is another list of 3D points. -Every frame, for every point in the first list, if any point in the second list is a certain 3D distance away, then there is a collision. The responses to it also provide a good example of a very important principle: Often, using a better algorithm can give you much greater gains than speeding up the implementation of the one you're using. So when something is too slow, the first thing you should ask is: Does there exist a better algorithm for what I'm trying to do? -- Greg
Re: [pygame] Python and Speed
René Dudfield wrote: - SIMD instructions are the fast ones... It's doubtful there's much in the Python core that would benefit from SIMD, though. Most of what it does doesn't involve doing repetitive operations on big blocks of data. -- Greg
Re: [pygame] Python and Speed
Ian Mallett wrote: How do you write an extension module in C and call it from Python? Someone gave some instructions earlier, but I found them too vague... Another way I forgot to mention earlier is to use the ctypes module (I often forget about it, because it wasn't in the stdlib until very recently.) That allows you to call compiled routines from a shared library directly without having to write any C. It's less efficient, though, as it has to go through some Python wrapper objects to get there, and also more dangerous, because you can easily crash the interpreter if you don't get everything exactly right. -- Greg
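The ctypes route Greg describes can be seen in a few lines. A minimal sketch, assuming a Unix-like system where the C math library is available as a shared library (on Windows the routines live in the C runtime DLL instead):

```python
import ctypes
import ctypes.util

# Locate and load the C math library (the fallback name is the
# common glibc soname; an assumption, adjust for your platform).
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# ctypes assumes int arguments and return values unless told
# otherwise - getting these wrong is exactly how you crash the
# interpreter, as Greg warns above.
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(2.0))  # 1.4142135623730951
```

The same pattern works for any C function you compile into your own shared library: load it with `ctypes.CDLL`, declare `argtypes`/`restype`, and call it as if it were Python.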
Re: [pygame] Python and Speed
Ian Mallett wrote: Yes, I've tried this, but there are issues with points being in two separate places. For example, if the collision radius is 5, and it is 3 away from the edge, then all the points in the neighboring trees must be tested. Rather than a tree, you may be just as well off using a regular array of cells. That makes it much easier to find the neighbouring cells to test, and there's also less overhead from code to manage and traverse the data structure. The only time you would really need a tree is if the distribution of the objects can be very clumpy, so that you benefit from an adaptive subdivision of the space. Another possibility to consider is instead of testing neighbouring cells, insert each object into all cells that are within the collision radius of it. That might turn out to be faster if the objects don't move very frequently. -- Greg
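Greg's second suggestion (insert each object into every cell its collision radius overlaps, so a query only has to look in one cell) might look like this sketch, in 2-D, with illustrative sizes:

```python
from collections import defaultdict

CELL = 10.0
RADIUS = 5.0


def cells_covering(p, radius):
    """All cell coordinates the object's collision circle overlaps."""
    x0, x1 = int((p[0] - radius) // CELL), int((p[0] + radius) // CELL)
    y0, y1 = int((p[1] - radius) // CELL), int((p[1] + radius) // CELL)
    return [(cx, cy) for cx in range(x0, x1 + 1) for cy in range(y0, y1 + 1)]


def build_grid(objects):
    """Each object is registered in several cells once..."""
    grid = defaultdict(list)
    for p in objects:
        for key in cells_covering(p, RADIUS):
            grid[key].append(p)
    return grid


def candidates(point, grid):
    """...so each query only checks the point's own cell."""
    return grid.get((int(point[0] // CELL), int(point[1] // CELL)), [])


grid = build_grid([(8.0, 8.0), (90.0, 90.0)])
print(candidates((12.0, 3.0), grid))  # [(8.0, 8.0)]
```

The trade-off Greg notes is visible here: insertion does more work (several cells per object), lookup does less, so it pays off when objects move rarely relative to how often they are queried.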
Re: [pygame] Python and Speed
ctypes is great stuff! I find it much harder to crash the interpreter with ctypes than with extensions I've developed and debugged. It is quite resilient. I've used it to interface with the Windows API to simulate keystrokes, to interface to a USB Digital IO interface, in a wrapper for the huge OpenCV library, to set the background image on my desktop, to adjust the system volume control, to interface to the Wiimote, and to wrap eSpeak to name a few. I've pretty much given up on swig and pyrex. gb Greg Ewing wrote: Ian Mallett wrote: How do you write an extension module in C and call it from Python? Someone gave some instructions earlier, but I found them too vague... Another way I forgot to mention earlier is to use the ctypes module (I often forget about it, because it wasn't in the stdlib until very recently.) That allows you to call compiled routines from a shared library directly without having to write any C. It's less efficient, though, as it has to go through some Python wrapper objects to get there, and also more dangerous, because you can easily crash the interpreter if you don't get everything exactly right.
Re: [pygame] Python and Speed
Hi! Another thought, same as before but adding the other comment about bins. If your object is heading in a certain direction and you know the surface point of that object, now make a dictionary key point for it. Same for all other objects, knowing their direction. key=str(x)+str(-y)+str(-z) #Keeping all as integer values in a coordinate grid. The only thing you compare are the keys. Then you will say, wait a minute, they are not both going to collide at that point! OK, then you know the direction of both and the point where they would in fact collide. Now that point will be approaching both at the same point in that straight line. The vertex of the point may change where they intersect, but keep taking the key value of that point, because both will always have that vertex point on that point on the surface. Keep updating the dictionary key and compare the value for both with the if key in... That point is a key value and easy to check. It is not the surface, just the intersection point of both straight lines of the objects traveling through free space... So you will be checking 2 keys for 2 objects, 3 keys for 3 objects, and so on... The only addition to this is if you have an object that is not a perfect shape, like a boulder, where that outer edge will change depending on the angle of your straight line to the vertex or intersection point. Not checking surfaces, just checking the intersection point of the line, for both will have to meet, and the distance to that object's surface will also be that straight-line distance from the object center, which will still direct you to the vertex/intersection point. Both surface points that meet, having the same key value, will also be the collision point when they do in fact collide... So all points will match. Just another observation and thought in this faster and faster check of objects. For I do something like this with my primitive Battleship game. Instead of an array, just keys for where an object part is located. 
When the missile or shell hits that key it matches the object's location at that same key. No completely filled huge array, just points. When doing this you have up to 3 points to calculate: the point on the surface of object 1, the point on the surface of object 2, and then the intersection point of the vectors of both objects. When all 3 points match with the same key, you have your collision! Like I said before, what you are updating is the vector angle, the surface location, and the intersection point of that straight line of the vectors of both objects. Bruce
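The core of Bruce's scheme, comparing dictionary keys built from integer grid coordinates instead of doing distance math, can be sketched like this. (A tuple key is used here instead of the string concatenation in his post, since str(1)+str(12) and str(11)+str(2) would produce the same key; all names are illustrative.)

```python
def grid_key(pos, cell=1.0):
    """Snap a position to integer grid coordinates, used as a
    hashable dictionary key."""
    return (int(pos[0] // cell), int(pos[1] // cell), int(pos[2] // cell))


# Register each object part under the key for its location,
# Battleship-style: no huge filled array, just occupied points.
occupied = {}
occupied[grid_key((10.2, 4.7, 0.0))] = "ship"

# A shell landing anywhere in the same grid cell matches with a
# single dict lookup - no per-pair distance computation.
shell = (10.9, 4.1, 0.4)
if grid_key(shell) in occupied:
    print("hit:", occupied[grid_key(shell)])
```

This is really a coarse spatial hash: it trades exactness for speed, since anything inside the same cell counts as a hit regardless of the true distance.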
Re: [pygame] Python and Speed
Ian Mallett wrote: Programmers shouldn't have to optimize their inefficiently executed code, the code should just be executed efficiently. Even if your inefficient algorithm is being executed as fast as possible, it's still an inefficient algorithm, and you will run into its limitations with a large enough data set. Then you will have to find a better algorithm anyway. Part of the skill of being a good programmer is having the foresight to see where such performance problems are likely to turn up further down the track, and choosing an algorithm at the outset that at least isn't going to be spectacularly bad. Doing this saves you work in the long run, since you spend less time going back and re-coding things. -- Greg
Re: [pygame] Python and Speed
Devon Scott-Tunkin wrote: would you set out 2d partitioning the screen in pygame by making say 4 invisible sprites with rects ... or by just using x,y values Just use coordinates. Sprites are totally unnecessary, since they're not something that appears on the screen. -- Greg
Re: [pygame] Unexpect blit behavior when using BLEND_ADD
On Fri, Apr 18, 2008 at 7:36 AM, Greg Ewing [EMAIL PROTECTED] wrote: Marcus von Appen wrote: Maybe what's needed is another set of flags to specify what transfer function to use for the alpha channel, independently of the RGB channels. +1
Re: [pygame] Unexpect blit behavior when using BLEND_ADD
Brian Fisher wrote: On Thu, Apr 17, 2008 at 10:41 AM, Lenard Lindstrom [EMAIL PROTECTED] wrote: For alpha surfaces a blend may be useful for doing shadow effects. Such effects require extracting and altering the alpha plane. Want to make a shadow for antialiased text. Extract the alpha and apply to an all black surface. Right now surfarray must be used to do this in a practical way. Otherwise it would have to be done per-pixel with get_at/set_at. I'm curious - how exactly would you expect to be able to extract the alpha channel from a font char and apply it to a black surface using the blit flags? BLEND_MAX? Is there a technique that would also work for a surface with arbitrary color values? I was thinking of this: text = font.render("some text", True, some_color) alpha = pygame.Surface(text.get_size(), pygame.SRCALPHA, 32) alpha.fill((0, 0, 0, 255)) alpha.blit(text, (0, 0), None, pygame.BLEND_MIN) This assumes BLEND_MIN applies to alpha as well. If dA = sA + dA - ((sA*dA)/255) then a destination color of (0, 0, 0, 0) will work. Source color is irrelevant. It extracts alpha from any image. -- Lenard Lindstrom [EMAIL PROTECTED]
Re: [pygame] Unexpect blit behavior when using BLEND_ADD
Also, I guess the docs should be updated to specify which blend operations the flags do exactly.
Re: [pygame] Unexpect blit behavior when using BLEND_ADD
René Dudfield wrote: On Fri, Apr 18, 2008 at 7:36 AM, Greg Ewing [EMAIL PROTECTED] wrote: Marcus von Appen wrote: Maybe what's needed is another set of flags to specify what transfer function to use for the alpha channel, independently of the RGB channels. +1 This means inlined blending code will have to be replaced with indirect function calls. Otherwise, with all the possible permutations of blending, alphablit.c will balloon in size uncontrollably. -- Lenard Lindstrom [EMAIL PROTECTED]
Re: [pygame] Unexpect blit behavior when using BLEND_ADD
Lenard Lindstrom wrote: This means inlined blending code will have to be replaced with indirect function calls. Otherwise, with all the possible permutations of blending, alphablit.c will balloon in size uncontrollably. You could make two passes over the surface, one to update the RGB and one to update the alpha. Or you could have a set of blending operations that update the alpha only, and let the user make two blit calls. -- Greg
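Greg's two-pass idea can be illustrated per-pixel in plain Python on (R, G, B, A) tuples. This is only the arithmetic of the approach, not pygame's actual C blitters, and the blend modes chosen (additive RGB, max alpha) are just examples:

```python
def blend_rgb_add(src, dst):
    """Pass 1: additive blend on the colour channels only;
    the destination alpha is left untouched."""
    return (min(src[0] + dst[0], 255),
            min(src[1] + dst[1], 255),
            min(src[2] + dst[2], 255),
            dst[3])


def blend_alpha_max(src, dst):
    """Pass 2: combine only the alpha channel; colour untouched."""
    return (dst[0], dst[1], dst[2], max(src[3], dst[3]))


src, dst = (10, 20, 30, 200), (250, 100, 0, 50)
pixel = blend_alpha_max(src, blend_rgb_add(src, dst))
print(pixel)  # (255, 120, 30, 200)
```

Because each pass touches a disjoint set of channels, the two blits commute, which is what makes "let the user make two blit calls" a workable interface.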
Re: [pygame] Unexpect blit behavior when using BLEND_ADD
René Dudfield wrote: ... or we use a runtime assembler to generate the required blitters at run time ;) Or reimplement pygame on top of OpenGL... Isn't someone already working on that? -- Greg
Re: [pygame] Unexpect blit behavior when using BLEND_ADD
hi, yeah, you can do pygame on top of opengl in a few different ways. SDL 1.3/2 has opengl support built in. Also you can use things like lamina (http://pitchersduel.python-hosting.com/browser/branches/Lamina/) to use opengl with SDL 1.2.x. But, obviously opengl sucks in certain ways - like it doesn't exist at all on some computers, or is buggy - which is why we use software. So I guess this discussion is about improving the software blitters :) cu, On Fri, Apr 18, 2008 at 11:25 AM, Greg Ewing [EMAIL PROTECTED] wrote: René Dudfield wrote: ... or we use a runtime assembler to generate the required blitters at run time ;) Or reimplement pygame on top of OpenGL... Isn't someone already working on that? -- Greg
Re: [pygame] Unexpect blit behavior when using BLEND_ADD
Greg Ewing wrote: Lenard Lindstrom wrote: This means inlined blending code will have to be replaced with indirect function calls. Otherwise, with all the possible permutations of blending, alphablit.c will balloon in size uncontrollably. You could make two passes over the surface, one to update the RGB and one to update the alpha. Or you could have a set of blending operations that update the alpha only, and let the user make two blit calls. Having a separate blit_alpha might be the way to go. This may actually remove intermediary steps involving an alpha-only surface. Also, it can be abstracted into a general purpose color plane toolkit. While PixelArray is good at carving up a surface it doesn't replace Numeric or NumPy for doing blanket operations on a surface's contents. -- Lenard Lindstrom [EMAIL PROTECTED]
Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 3:03 PM, Greg Ewing [EMAIL PROTECTED] wrote: Patrick Mullen wrote: Also, if you are using sqrt for your distance check, you are likely wasting cpu cycles, if all you need to know is whether they are close enough. Nope; in this case, the calculations' margin of error must be very small. Also note that if you do need to compare exact Euclidean distances for some reason, you can avoid square roots by comparing the squares of the distances instead of the distances themselves. I already do that. On Thu, Apr 17, 2008 at 4:02 PM, Greg Ewing [EMAIL PROTECTED] wrote: Rather than a tree, you may be just as well off using a regular array of cells. That makes it much easier to find the neighbouring cells to test, and there's also less overhead from code to manage and traverse the data structure. I meant that. What is the difference between a tree and a cell? Cells are regular? Anyway, I had planned to do cells. The only time you would really need a tree is if the distribution of the objects can be very clumpy, so that you benefit from an adaptive subdivision of the space. Another possibility to consider is instead of testing neighbouring cells, insert each object into all cells that are within the collision radius of it. That might turn out to be faster if the objects don't move very frequently. I like that idea. Still, the objects can and do move. On Thu, Apr 17, 2008 at 4:23 PM, Greg Ewing [EMAIL PROTECTED] wrote: There are an extremely large number of modifications that could be made to Python. Only a very small number of them will result in any improvement, and of those, all the easy-to-find ones have already been found. The harder ones must then be attacked--solving a difficult speed issue might save the tiresome implementation of optimizations on the part of hundreds of users. 
If you want to refute that, you're going to have to come up with an actual, specific proposal, preferably in the form of a code patch together with a benchmark that demonstrates the improvement. If you can't do that, you're not really in a position to make statements like it can't be that hard. Like I said, because I am the programmer, not the Python modifier, it is my job to make the programs run fast. By can't be that hard, I mean that if C++ can do it, Python should be able to too. Obviously, I don't know how Python is structured, and I doubt I have the experience of the people on the Python team, but if I can make optimizations in my code, they should be able to make modifications in Python. If wishing could make it so, Python would already be blazingly fast! Ian is wishing... On Thu, Apr 17, 2008 at 4:32 PM, Greg Ewing [EMAIL PROTECTED] wrote: Even if your inefficient algorithm is being executed as fast as possible, it's still an inefficient algorithm, and you will run into its limitations with a large enough data set. Then you will have to find a better algorithm anyway. Part of the skill of being a good programmer is having the foresight to see where such performance problems are likely to turn up further down the track, and choosing an algorithm at the outset that at least isn't going to be spectacularly bad. Doing this saves you work in the long run, since you spend less time going back and re-coding things. Not necessarily. I've had situations where I've decided to do something, then draft-coded it, then decided that the game feature wasn't necessary, was the wrong approach, or simply to scrap the entire project. If I had decided to spend the time to code something with all of the optimizations, it would take longer. When I delete it, that extra effort would have been worthless. Once a feature is implemented, tested, and so on, I decide if it is needed, then add optimizations and comments. 
Because optimizations often rewrite code in a more efficient way, the original code works as a guideline for the operation. All this saves me effort in the long-run; but I can't speak for anyone else. Greg Ian
Re: Re: [pygame] Python and Speed
Ian Mallett [EMAIL PROTECTED] wrote: ... if C++ can do it, Python should be able to too. Obviously, I don't know how Python is structured ... Then please learn more about CPython's internals and the general problem of optimising a dynamic language like Python. The CPython code is incredibly well-written and easy to understand, even with the various complications that exist due to current optimisations* and there's plenty of academic research into dynamic languages out there. I'm sure the core devs would welcome your patches with open arms! I'll leave you with Greg's wisdom, which perhaps needs repeating: If wishing could make it so, Python would already be blazingly fast! Richard * for example the double dict-lookup implementations that seamlessly replace the string-only lookup with an arbitrary-object lookup on the first non-string lookup, or the function-call frame caching mechanism that I had a hand in implementing...