Re: [pygame] Python and Speed
I'm afraid that no amount of optimisation will suffice--even C is too slow. I've found examples of how to use shaders on the GPU. This should be faster, and relevant too, as the algorithm in question is somewhat pertinent to graphics processing. Ian
Re: [pygame] Python and Speed
On Sat, 26 Apr 2008, Ian Mallett wrote: I would conclude this message simply by saying, for those working on Python, keep working on making it faster. Good job. And as I've mentioned so many times, this is not the place to post such a message. Richard
Re: [pygame] Python and Speed
Ian, Below is a simple check that needs to know only the angle of the vector. The while loop moves along that vector in steps. This is what is needed on a normal screen with 2 or more objects, because the angle between them is what is being used here. It assumes one object is static and one moving, but that can be compensated for: the direction stays the same, and both objects agree on the same ending point, in other words the static point where they are both eventually going to meet. The while loop steps through the full length of the vector, but that is not needed if you know the distance to the outer edge of your object. If you know the angle and the distance to the outer edge of your object's surface, you just compare the point created by this vector. Once the point is the same for both objects, you have a collision. No tracking of pixels for a merge, just the distance to the outer edge of the object based on the angle between them. That makes this the simplest and least time-consuming way to find collisions. Granted, it is fun and neat to use pixels, but is that really necessary? If you're trying to find the edge of your object based on an angle, you may have to scan for where the color changes, but is that necessary? You could keep a mapping of each object, overlay the angle onto it, and find the edge that way. Once the edge for that angle is found, the collision has happened when both objects report the same point. Your static object's copy will have an edge point that changes with the angle, so trace that change by angle. Only a few math steps are required, not a sweeping massive array of all pixels... My grid is an 8x8 matrix, but that can easily be replaced with screen coordinate values, like 900x900; the angle and the final landing point on that matrix are all we're talking about.
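[Editor's sketch: the angle-and-step idea above, in a few lines of Python. Names are hypothetical, not from Bruce's actual module; the point is just stepping along the vector given by the angle and reporting a collision once the distance between the objects shrinks to the sum of their edge distances.]

```python
import math

def collides(ax, ay, a_radius, bx, by, b_radius):
    # Distance between centres along the line joining the two objects.
    dist = math.hypot(bx - ax, by - ay)
    # The objects' outer edges touch when the centre distance shrinks
    # to the sum of the two edge distances (here, simple radii).
    return dist <= a_radius + b_radius

def step_along(x, y, angle, step):
    # Advance a point along a vector knowing only its angle.
    return x + step * math.cos(angle), y + step * math.sin(angle)

# One moving object heading straight at a static one at (100, 0).
x, y = 0.0, 0.0
while not collides(x, y, 5, 100, 0, 5):
    x, y = step_along(x, y, 0.0, 1.0)
print(x, y)  # the point where the two edges meet
```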
Now to get within a pixel area, expand the screen size outward: instead of 900x900, make it 9000x9000. If the area falls within 801x802, then expanded out we fall within 8010 and 8020, or within 10, which is like saying we have fallen within a 10x10 box, so the calculations need not be exact and rounding factors are allowed for... This module is used for all calculations, including weapon fire that travels through space, ship movement, and even landing at Star Bases, commonly referred to as Docking.

from math import sin, cos    # implied elsewhere in the program
from random import randint   # implied elsewhere in the program

def NAV(GG, SS, Dir, Warp):
    """NAVIGATE USING ONLY AN ANGLE FROM A GIVEN SHIP, IN A GIVEN GALAXY/TABLE!"""
    Angle = GG.Dir2Ang(Dir)  # CONVERT DIRECTION BACK TO ANGLE!
    qy = SS.Qy
    qx = SS.Qx
    sy = SS.Sy
    sx = SS.Sx
    z = 1.0
    Hit = 0
    last4sx = sx
    last4sy = sy
    last4z = 2
    Warp = float(Warp)  # TO MAKE SURE THE NUMBER IS A NUMBER AND NOT TEXT!
    Sin = sin(Angle)  # OPPOSITE OVER HYPOTENUSE, VECTOR!
    GG.SIN = Sin  # GLOBAL VARIABLE!
    Cos = cos(Angle)  # ADJACENT OVER HYPOTENUSE, VECTOR!
    GG.COS = Cos  # GLOBAL VARIABLE!
    # SET TO PLAY SOUND AS IT TRAVELS!
    # Clear_Wait()  # MAY SET IF ANY STOP PLAY NEEDED!
    if GG.COM == TOR:
        print "The %s Is Now Firing" % SS.N,
        SS.T -= 1
        if SS.IMG == GG.ESI:
            ps = randint(1, 3)
            if ps == 1:
                print "Photon Torpedos!"; PlaySound("Federation_Photons.ogg", 0, .8)
            elif ps == 2:
                print "Quantum Torpedos!"; PlaySound("Federation_Quantums.ogg", 0, .8)
        else:
            ps = randint(1, 4)
            if ps == 1:
                print "Photon Torpedos!"; PlaySound("Klingon_Photons.ogg", 0, 1.2)
            elif ps == 2:
                print "Cardassian Photon Torpedos!"; PlaySound("Cardassian_Photons.ogg", 0, 1.2)
            elif ps == 3:
                print "Romulan Photon Torpedos!"; PlaySound("Romulan_Photons.ogg", 0, 1.2)
    if GG.COM == PHA:
        print "The %s Is Now Firing" % SS.N,
        SS.P -= SS.P / 10.0
        if SS.IMG == GG.ESI:
            print "Phasers!"; PlaySound("Federation_Phaser.ogg", 0, 3)
        else:
            ps = randint(1, 5)
            if ps == 1:
                print "Klingon Disruptors!"; PlaySound("Klingon_Disruptor.ogg", 0, 1.2)
            elif ps == 2:
                print "Cardassian Disruptors!"; PlaySound("Cardassian_Disruptor.ogg", 0, 1.2)
            elif ps == 3:
                print "Romulan Disruptors!"; PlaySound("Romulan_Disruptor.ogg", 0, 1.2)
            elif ps == 4:
                print "Phasers!"; PlaySound("Defiant_Phaser.ogg", 0, 3.5)
    # UPDATE THE STAR DATE BASED ON WARP!
    if GG.COM == NAV:  # START ENGINES, SET ENERGY, AND STARDATE!
        SS.E -= Warp * 10.0 + 10.0
        if SS.IMG == GG.ESI:
            GG.SAFE = 0
        GG.SDT += Warp
        print "Command(%s) %s" % (GG.COM, SS.N)
        print "Moving From Quadrant(%d,%d) Sector(%d,%d) At DIR: %1.2f WARP: %2.2f" % (qy, qx, sy, sx, Dir, Warp)
        PlaySound("Federation_Warp.ogg", 0, -1)
    if SS.IMG == GG.ESI and GG.COM != TAR:
        print "TRACK:"
    if Warp == 1:  # original read "Warp=1"; "==" assumed
        Warp *= GG.RM
Re: Re: [pygame] Python and Speed
OK, my point here is that if C languages can do it, Python should be able to too. I think all of this answers my question about why it isn't...
Re: Re: [pygame] Python and Speed
On Fri, Apr 18, 2008 at 9:23 AM, Ian Mallett [EMAIL PROTECTED] wrote: OK, my point here is that if C languages can do it, Python should be able to too. I think all of this answers my question about why it isn't... You don't have to use C to get fast programs. OCaml is very fast (between C and C++), especially when you start doing interesting things. It comes with an interpreter, a bytecompiler, and an optimizing compiler. Also, there is OCamlSDL, which is the pygame of the OCaml world. http://ocamlsdl.sourceforge.net/ It takes a little bit of brainbending to wrap your mind around the OCaml language, but once you figure it out you can write real programs quickly, and have them be very optimized. I prefer hacking around with pygame and python because you get so much flexibility. You don't have to declare variables, you just use them. You don't have to muck around with makefiles. You can mix different types of data in dictionaries. It is just easier, but the price you pay is performance. In a typed language like OCaml, the compiler might know that every entry in a dictionary is an integer so it can optimize every access. In python, the interpreter has no idea what will come out when you request a key, it could be an integer, a sprite object, None, ... The programming languages community is working feverishly to combine the benefits of typed languages with the ease of use of dynamic languages, but it is an ongoing effort. -- Nathan Whitehead
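[Editor's note: Nathan's point about untyped dictionaries is easy to see in a couple of lines of Python; this is my own minimal illustration, not from the thread.]

```python
# One dict, three value types: the interpreter cannot assume anything
# about what a key lookup will return.
entries = {"score": 42, "sprite": object(), "bonus": None}

for key, value in entries.items():
    print(key, type(value).__name__)

# A typed-language compiler could specialize this lookup; CPython must
# dispatch on the value's actual run-time type every time.
total = entries["score"] + 1  # works only because this value is an int
print(total)  # 43
```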
RE: Re: [pygame] Python and Speed
-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Nathan Whitehead Sent: Friday, April 18, 2008 11:24 AM To: pygame-users@seul.org Subject: Re: Re: [pygame] Python and Speed [quoted message trimmed] If you're going to start recommending alternate languages, really, let's just throw out a link to the computer language shootout: http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=all If python isn't working out for you performance wise, sort by speed and head down the list until you find one you like.
Programming languages are tools; pick the right one for the job. - John Krukoff [EMAIL PROTECTED]
Re: [pygame] Python and Speed
On Apr 18, 2008, at 9:23 AM, Ian Mallett wrote: OK, my point here is that if C languages can do it, Python should be able to too. I think all of this answers my question about why it isn't... C can do what? C is, at best, a constant time improvement in performance over python. A bad algorithm in Python is also a bad algorithm in C. It's all well and good to think that Python should be as fast as C, but no one is going to take you seriously unless you have a specific proposal, preferably with an implementation that proves its merit. Otherwise it's just wishful thinking. But the larger point is that making things run faster is not a panacea; reducing the algorithmic complexity is the best solution. Make sure you have the best algorithm before you worry about reducing the constant time factors. -Casey
Re: [pygame] Python and Speed
True, although that constant is often on the order of 20, and 40 FPS is a lot different than 2 FPS. --Mike Casey Duncan wrote: [quoted message trimmed]
Re: [pygame] Python and Speed
On Apr 18, 2008, at 1:31 PM, Michael George wrote: True, although that constant is often on the order of 20, and 40 FPS is a lot different than 2 FPS. [earlier quoted text trimmed] The point (that's already been made, but I'll repeat it) is that a better algorithm can often achieve the same results without leaving Python. When there is no better algorithm, then using a higher-performing language is about the only option you have for better performance. There is an even better option, dedicated hardware, but that is far beyond the reach of most mere mortals. That's the reason why we have things like SIMD instruction sets, GPUs and physics co-processors. All this conjecturing about Python performance in relation to compiled languages makes me wonder: why is the performance of C code not as good as that of well-written hand-coded assembler? Surely if the machine can do it, C can? The problem boils down to one of specificity. Any language that tells the computer more specifically what to do can be faster than one that doesn't. Unfortunately, telling the computer specifically what to do is not a productive way to solve most problems.
I'd rather not have to tell the computer which machine codes to execute, but if I did, and I was clever, I could make my program the fastest possible -- for a given specific architecture at least. Python is more abstract and general than C, C is more abstract and general than machine code. Therefore machine code is potentially faster than C, and C is potentially faster than Python. And unless you reduce the generality of Python, it will always be possible to write faster programs in C, given a competent compiler. JITs (e.g., psyco) can help mitigate this, but they too are ultimately constrained by Python's expressive power. So unless you are proposing to reduce Python's expressive power, you are waging a losing battle. Until such time as machine intelligence exceeds our own, it will always be possible for a human programmer to get higher performance from the lower-level language. And I further propose that when the time arrives that machine intelligence exceeds our own, this will not be the problem foremost on my mind ;^) -Casey
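[Editor's sketch: Casey's point that algorithmic cost dwarfs constant factors can be felt directly inside Python itself. This is my own illustration, not from the thread; it times an O(n) list membership test against an O(1)-average set lookup with the stdlib timeit module.]

```python
import timeit

n = 2000
data = list(range(n))
as_set = set(data)

# Same interpreter, same data, same query; only the data structure
# (and hence the algorithm behind "in") differs.
list_time = timeit.timeit(lambda: (n - 1) in data, number=200)
set_time = timeit.timeit(lambda: (n - 1) in as_set, number=200)

print("list membership:", list_time)
print("set membership: ", set_time)  # far smaller on any machine
```

Exact numbers vary by machine, but the gap grows with n: the list scan is linear while the set lookup stays roughly constant.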
Re: [pygame] Python and Speed
Ian Mallett wrote: What is the difference between a tree and a cell? Cells are regular? Yes. The term tree implies a recursive data structure -- in this case, that the cells can be further divided into subcells, and those into subsubcells, etc., to arbitrary depth. If the division only goes one level deep, it's not a tree in computer science terms, it's just an array. By "can't be that hard", I mean that if C++ can do it, Python should be able to too. Several people have pointed out the reasons why that line of thinking is seriously flawed. If I can make optimizations in my code, they should be able to make modifications in Python. Making manual optimisations to particular aspects of a particular game program is one thing. Doing the same thing automatically, in general, for any Python program is something quite different. If I had decided to spend the time to code something with all of the optimizations, it would take longer. If the more-efficient algorithm is substantially harder to code, or results in obscure code that is harder to maintain, then yes, you should try a simpler approach first. But sometimes there is a better approach that isn't much more difficult to do up front, such as using a quicksort or mergesort rather than a bubble sort. The code isn't much bigger, and it's guaranteed to perform reasonably well however big the data gets, so it would be silly not to use it in the first place. That probably doesn't apply in this case -- the test-everything-against-everything approach is very easy to try out compared to the alternatives, and it might be fast enough. But even if it's fast enough for the cases you test it on, it might not be fast enough for all the cases encountered when you put it into production. Maybe it's okay for the 100 sprites in the level you release with the game, but if the game has a level editor, and someone tries to use 1000 sprites, they could be disappointed.
Whereas if you put in a bit more effort and use a considerably better algorithm, they could use 10,000 sprites and conclude that your game is totally awesome! -- Greg
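[Editor's sketch: Greg's one-level cell subdivision (the "array, not a tree" case) is essentially a spatial hash. This is my own minimal sketch; the names and the cell size are hypothetical.]

```python
from collections import defaultdict

CELL = 32  # cell size in pixels; sprites larger than a cell need extra care

def build_grid(sprites):
    # One level of subdivision: a flat mapping of cell -> sprite ids,
    # not a recursive tree.
    grid = defaultdict(list)
    for sid, (x, y) in sprites.items():
        grid[(int(x) // CELL, int(y) // CELL)].append(sid)
    return grid

def nearby(grid, x, y):
    # Collision candidates come only from the 3x3 block of cells around
    # a point, instead of testing everything against everything.
    cx, cy = int(x) // CELL, int(y) // CELL
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            found.extend(grid.get((cx + dx, cy + dy), []))
    return found

sprites = {i: (i * 10.0, 0.0) for i in range(100)}
grid = build_grid(sprites)
print(len(nearby(grid, 500.0, 0.0)))  # a handful of candidates, not all 100
```

Rebuilding the grid each frame is O(N), and each query touches a constant number of cells, so the all-pairs O(N^2) test is avoided.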
Re: [pygame] Python and Speed
Casey Duncan wrote: And I further propose that when the time arrives that machine intelligence exceeds our own, this will not be the problem foremost on my mind ;^)

Python 49.7 (#1, Aug 5 2403, 15:52:30)
[GCC 86.3 24020420 (prerelease)] on maxbrain
Type "help", "copyright", "credits" or "license" for more information.
>>> import omnius
[autoloading std.ai]
[executing omnius.__main__]
YOU ARE SUPERFLUOUS. CONNECTING 33kV TO KEYBOARD USB PORT.
-- Greg
Re: [pygame] Python and Speed
-I think it has been thoroughly established that Python cannot be as fast as C. -As far as algorithms go, intelligence is better, but I hold by using vastly simpler ones to speed development. Someone just mentioned sorting methods. In that case, obviously, a little extra coding doesn't hurt, but changing your game's architecture each time a largish optional feature is added is a bad idea. -I also still hold by wanting Python to be faster. I don't care if it is impossible; I still want it to be. I'm not going to give up on Python's great niceness just for a want of some speed. Ian
Re: [pygame] Python and Speed
Learn to write C. The best software is written as a hybrid of multiple technologies, each serving a different purpose. Python's strengths are rapid development and succinct, easy-to-read code. C's strengths are flexibility and machine optimization. MMX and SSE assembly code are for maximum performance in core mathematical routines. Bruce Lee said that a true practitioner must be like the water - able to adapt to any attacker's style and defend in the most effective manner. Since Python will never be as fast as C, you must learn C in order to become a better programmer. Don't expect anyone to change the laws of the universe for you. Richard Ian Mallett wrote: [quoted message trimmed]
Re: [pygame] Python and Speed
On Fri, Apr 18, 2008 at 8:14 PM, Richard Goedeken [EMAIL PROTECTED] wrote: [quoted message trimmed] Like I said, I prefer nice code over speed (usually). I don't like C, but I know there are times when it is better to use it.
Re: [pygame] Python and Speed
The way I speed up my python code is Ctypes. I just make a dll file in C or asm and then call it with Ctypes and presto. I have tons of speed at my fingertips. Just my 2 cents :)
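[Editor's sketch: Jason's ctypes route also works without building any DLL when the function you need already lives in a system library. This is my own illustration, assuming a Unix-style C math library that ctypes.util can locate; paths and library names vary by platform.]

```python
import ctypes
import ctypes.util

# Locate and load the C math library (libm on most Unix systems).
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# ctypes assumes int arguments and return values unless told otherwise,
# so declare the double-precision signature explicitly.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

print(libm.cos(0.0))  # 1.0
```

The same CDLL/argtypes/restype pattern applies unchanged to a shared library you compile yourself from C or asm.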
Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 2:21 PM, Greg Ewing [EMAIL PROTECTED] wrote: René Dudfield wrote: 2. - asm optimizations. There seems to be almost no asm optimizations in CPython. That's a deliberate policy. One of the goals of CPython is to be very portable and written in a very straightforward way. Including special pieces of asm for particular architectures isn't usually considered worth the maintenance effort required. Other, more portable, software has proved this to be somewhat wrong I think. Optional asm software is used in a lot of software today to good effect. Also this decision was made a while ago, and things have changed since then I think. - python now has unittests. So testing that the asm code works, and keeps working correctly is much easier. - x86 is now very common. Most mainstream server, and desktops use x86. So just targeting x86 gives you a lot more benefit now. - SIMD instructions are the fast ones... so you don't actually have to learn all that much to write fast asm - you only have to learn a subset of asm. You can get compilers to generate the first pass of the function, and then modify it. Of course writing the fastest possible asm still requires effort - but it is fairly easy for a novice to beat a compiler with SIMD code. - advanced compilers can generate asm, which can then be used by worse compilers. eg, the intel compiler, or vectorc compiler can be used to generate asm, and then be included into C code compiled by gcc. - libraries of fast, tested asm code are available. eg, from amd, intel and others. - python, and FOSS now has a much larger development community with more asm experts. CPython could use faster threading primitives, and more selective releasing of the GIL. Everyone would love to get rid of the GIL as well, but that's another Very Hard Problem about which there has been much discussion, but little in the way of workable ideas. Yeah, not getting rid of the GIL entirely - but selectively releasing it. 
As an example, pygame releases the GIL around certain C functionality like pygame.transform.scale. Freebsd and linux have also followed this method - adding more fine grained locking where it is worth it - and improving their threading primitives. I think there has been work already in fixing a lot of python threading issues in the last year - but there's lots more to do. I'm using python on 8 core machines for my work loads just fine today. A way to know how much memory is being used. Memory profiling is the most important way to optimize since memory is quite slow compared to the speed of the cpu. Yes, but amount of memory used doesn't necessarily have anything to do with rate of memory accesses. Locality of reference, so that things stay in the cache, is more important. If you are using 200 bytes for each int, then you can quickly process 50x less data than with an int that takes up 4 bytes. If you have 1 gig of available memory, and say kjDict uses up half the memory of a normal dict, a normal dict would use up 2 gigs where your kjDict will use up 1 gig. In this case the kjDict would be massively faster than a normal dict because of swapping. I think memory is one of the most important areas in optimising a program these days. So python should provide tools to help measure memory use (how much memory things use, and how things are allocating memory). Perhaps releasing a patch with a few selected asm optimizations might let the python developers realise how much faster python could be... Have you actually tried any of this? Measurement would be needed to tell whether these things address any of the actual bottlenecks in CPython. You can try it easily yourself - compile python with machine specific optimisations (e.g. add -mtune=athlon to your gcc arguments). You can run this python binary, and get faster benchmarks. This provides the proof that more optimised assembly can run faster.
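[Editor's note: on the wish for memory-measurement tools, CPython did grow a basic per-object one, sys.getsizeof, added in Python 2.6 shortly after this thread. It reports shallow sizes only, so containers need their referents added by hand; a minimal sketch:]

```python
import sys

# Shallow sizes: the list's reported size excludes the ints it refers to.
numbers = list(range(1000))
print(sys.getsizeof(numbers))     # the list object itself
print(sys.getsizeof(numbers[0]))  # one small int

# A rough "deep" size adds the referents (no cycle or sharing handling).
deep = sys.getsizeof(numbers) + sum(sys.getsizeof(n) for n in numbers)
print(deep)
```

Allocation tracing (the "how things are allocating memory" half of the wish) needed third-party tools at the time; getsizeof only answers the per-object question.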
Also the link I gave to a commonly used memcpy function running 5x faster should provide you with another proof of the possibilities. Other software being sped up by asm optimisation provides another proof (including SDL, pygame, linux etc). The Pawn language's virtual machine written in nasm is lots faster than the version written in C - which provides another proof. Psyco is another proof that asm can speed up python (psyco is a run time assembler). The idea is you only optimise key stable functions in asm - not everything. For example in SDL the blit functions are written in asm - with C implementations. It's using the best tool for the job: Python for the highest level, then C, then asm. I think a patch for CPython would need to be made with benchmarks as a proper proof though - but hopefully the list above provides theoretical proof to you that adding asm optimisations would speed up CPython. However the recompilation with cpu specific compiler flags would only need: - cpu
Re: Re: [pygame] Python and Speed
René Dudfield [EMAIL PROTECTED] wrote: On Thu, Apr 17, 2008 at 2:21 PM, Greg Ewing [EMAIL PROTECTED] wrote: René Dudfield wrote: 2. - asm optimizations. There seems to be almost no asm optimizations in CPython. That's a deliberate policy. One of the goals of CPython is to be very portable and written in a very straightforward way. Including special pieces of asm for particular architectures isn't usually considered worth the maintenance effort required. Other, more portable, software has proved this to be somewhat wrong I think. I think this is the wrong forum to be having this discussion :) Richard
Re: Re: [pygame] Python and Speed
Hi! No, this is the place to discuss it, because if we wish to make games, work with existing platforms, and want speed, that is the way to go. Now that we have had this discussion and found solutions, we have a list of ways to resolve it. This is the place to discuss all of this; it brings to the front the issues of speed, connections, and overall solutions. The only way to make Pygame better, faster and competitive with the world... Just like the question I had with tts (text to speech): even though I do not use the video end yet, I do use the sound end. So Ian's speed question is a very good one, and take a look at what came of it below. I learn by doing; examples help because I get to use them, tweak them, and eventually, like many have done, come up with a better solution or firm conclusion. I now understand adding options into my setup.py file (or now I call it setup4tts.py) or anything for any need... Bruce From: Richard Jones I think this is the wrong forum to be having this discussion :) Richard From: Jason Ward The way I speed up my python code is Ctypes. I just make a dll file in C or asm and then call it with Ctypes and presto. I have tons of speed at my fingertips. Just my 2 cents :) [earlier quoted messages trimmed]
Re: Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 2:21 PM, Greg Ewing [EMAIL PROTECTED] wrote: [quoted messages trimmed] I see a slight problem with these architecture-specific optimisations: once you release your game out onto the general public, even if you were using super-optimized awesomeness and everything ran fine, the majority of people will be running whatever version of the library came with their OS rather than compiling their own, and it might be somewhat lacking.
Re: Re: [pygame] Python and Speed
I see a slight problem with these architecture-specific optimisations [...] it might be somewhat lacking. That's a good point; so everyone should run Gentoo! ;) As for memory management, whatever became of PyMalloc? I haven't heard much about it in a good while. -- This, from Jach.
Re: [pygame] Python and Speed
On Apr 16, 2008, at 6:36 PM, Ian Mallett wrote: Recently, I've been having issues in all fortes of my programming experience where Python is no longer fast enough to fill the need adequately. I really love Python and its syntax (or lack of), and all the nice modules (such as PyGame) made for it. Are there any plans to improve Python's speed to at least the level of C languages? I feel like Python is living down to its namesake's pace... Ian Python is slow; it is an interpreted language geared toward rapid application development at the expense of execution speed. It is also designed to be highly portable, also at the potential expense of execution speed. That said, much effort is put into making python perform well, and it is certainly possible to make extremely fast python programs when an algorithm is available that allows you to make clever use of data structures to get the job done more efficiently. And optimization is as much about data structures as it is about code (maybe more). Python comes with some tools (cProfile in particular) that allow you to figure out where your code is spending its time so you can speed it up. Note that by removing inefficient code, or choosing a better algorithm, it is possible to get massive speedups in Python, as in any language. Sometimes (especially in things like graphics and simulations) there is no magic algorithm to make things more efficient. Many times the best you can do is O(N), O(N^2) or even O(2^N). In these cases, Python runs out of steam real fast (in fact any language runs out of steam fast with O(2^N) algorithms). This is especially pronounced in games where you have to perform operations over large arrays once per frame. This is typical in 3D graphics, physics and particle systems. What you must often do in such cases is move the inner loop of these operations into native code. The easiest way to do this is to find a native library that already performs the operations you need. 
This could be a generic library (like numpy) or something more specific (like pygame or ode). This is especially easy if the library already has a Python binding; if not, you can make one using ctypes, pyrex or C directly. The downside here is the additional dependencies. If nothing exists that does what you want, you'll need to write some native code yourself. First identify the slow code. Make sure there is no way to make it fast enough within Python (is there a better algorithm? etc). Then move this code into an extension, written however you prefer. Note it is fairly easy to create an extension module that defines some functions in C that can be imported into Python (especially if they just do number crunching); the most complex aspect is the memory management. Tools like pyrex can automate that for you, but don't give you as much control over the code. But before you do any of this, do some profiling (run your game under 'python -m cProfile') and see what's taking up the time. How much time per frame do you have for game logic? Is your logic too slow or is the drawing too slow? Honestly, my biggest problem with pygame is not python being too slow, but the fill rate of SDL being too slow. I can solve the former with optimization and extensions, but the latter requires much more compromise (lower res, smaller sprites, etc). The hard reality is that the CPU is often too slow for graphics work no matter what language you use (even highly tuned SIMD asm code). That's why we have dedicated graphics hardware designed for the task. Right now SDL doesn't really take advantage of that, and that's a big limitation. -Casey
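Casey's advice to profile before optimizing can be sketched as follows. This is an illustrative example, not code from the thread: `hot_loop` is a made-up stand-in for per-frame game logic, profiled with the stdlib `cProfile` and `pstats` modules.

```python
import cProfile
import io
import math
import pstats

def hot_loop(n):
    # A deliberately naive O(n^2) pairwise distance sum, standing in
    # for the kind of per-frame work you would want to profile.
    points = [(i * 0.1, i * 0.2, i * 0.3) for i in range(n)]
    total = 0.0
    for ax, ay, az in points:
        for bx, by, bz in points:
            total += math.sqrt((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2)
    return total

# Profile the function and report the most expensive calls.
profiler = cProfile.Profile()
profiler.enable()
hot_loop(100)
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)  # top 5 entries by cumulative time
print(stream.getvalue())
```

For a whole program, `python -m cProfile yourgame.py` (as Casey mentions) gives the same report without touching the code.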
Re: [pygame] Python and Speed
On Apr 17, 2008, at 11:59 AM, Ian Mallett wrote: [..] This is precisely the problem I have run into in one of my in-dev games--iterating over large arrays once per frame. Actually, it is basically a collision detection algorithm--I have two arrays, both containing 3D points. The points in one array must be tested with the points in the other to see how close they are. If they are close enough, there is a collision. Naturally, this means that for every point in one array, the other array must be iterated through and the 3D pythagorean theorem performed to each tested point. Note this is not the most efficient way to do this, using a partitioned space you may be able to avoid comparing most points with one another most of the time. To do this in 2D you could use quad- trees, in 3D you could use oct-trees. See: http://en.wikipedia.org/wiki/Octree The easiest way to do this is to find a native library that already performs the operations you need. This seems like a rather uncommon function--I haven't found that which I need. Note ode already implements efficient 3D collision detection in naive code, I believe pymunk does this for 2D. pyode is a python wrapper for ode. FWIW, you would get better answers to your problems by asking more specific questions. If you had asked How do I make collision detection faster? you would have gotten much better answers than asking How do I make Python faster?. -Casey
Re: [pygame] Python and Speed
On Apr 17, 2008, at 12:15 PM, Casey Duncan wrote: Note ode already implements efficient 3D collision detection in naive code, I believe pymunk does this for 2D. pyode is a python wrapper for ode. heh, I meant to say native code 8^) -Casey
Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 12:15 PM, Casey Duncan [EMAIL PROTECTED] wrote: Note this is not the most efficient way to do this, using a partitioned space you may be able to avoid comparing most points with one another most of the time. To do this in 2D you could use quad-trees, in 3D you could use oct-trees. See: http://en.wikipedia.org/wiki/Octree Yes, I've tried this, but there are issues with points being in two separate places. For example, if the collision radius is 5, and it is 3 away from the edge, then all the points in the neighboring trees must be tested. Note ode already implements efficient 3D collision detection in naive code, I believe pymunk does this for 2D. pyode is a python wrapper for ode. I'll look into it. FWIW, you would get better answers to your problems by asking more specific questions. If you had asked How do I make collision detection faster? you would have gotten much better answers than asking How do I make Python faster?. Well, my question was actually how can Python be made faster? The collision detection was an example of where it is a problem. Ian
Re: [pygame] Python and Speed
On Wed, Apr 16, 2008 at 7:30 PM, Ian Mallett [EMAIL PROTECTED] wrote: Thus it falls to you as a developer to choose your implementation strategy wisely: But again, this is treating the symptoms, not the problem... I actually think the line of thinking I read in the comment above (thinking that I shouldn't have to optimize things, because the stupid language is slow) is in fact a counterproductive attitude in application development, and misses the real point. It would be nice of course if python ran much much faster, but it runs slow largely because it is designed to give you the flexibility to code complex things much easier. If you don't want that flexibility, then you need to turn it off with pyrex and extensions and all that kind of stuff. However sometimes that flexibility actually lets you code more efficient approaches to begin with. Ultimately all slowness is the programmer's problem, not the tool's. If a particular tool is the best to help you solve the problem, then it should be used. With python, coolness is always on, so it's cheap to use coolness. C++ was designed to make you not pay for anything you don't use, which means coolness is default off, which means it's really hard to use coolness. ...to get to brass tacks though, I've found that the majority of the real slowness in _anything_ I code is due to the approach I take, and much less so due to the code. For example, pairwise collision between a set of objects. If every object needs to be checked against every object, that's an n-squared problem. Get 1000 items, that's 1,000,000 collision checks. But let's say I do object partitioning, or bucketing or something where I maintain sortings of the objects in a way that lets me only check items against ones that are close to it, and I either get log(n) partitioning or maybe I get at most about 10 items per bucket (both very achievable goals). Now it means I only do about 10,000 (10*1000) collision checks for the same real work being done. 
So let's say that my python collision code takes 100 times as long as my c++ collision code - that means if I do the optimization in python, I can get the python code to go just as fast as the C code without the optimization. Not only that - let's say I decide I want to step stuff up to 10,000 items with pairwise collision - now it's 100,000,000 checks vs. like say 100,000 based on the approach - now python can actually be 10 times faster. So now the issue becomes what's the cost of writing the more efficient approach in python code vs. writing the naive approach in c++ code. If you think you get enough programmer benefits from working in python to make those 2 costs equal, and the performance of either is good enough, python is the better choice. Not only that, once you've got good approaches written in python that are stable and you don't need the coolness/flexibility, it becomes much easier to port the stuff to C, or pyrex it or whatever makes it much, much faster. On Thu, Apr 17, 2008 at 11:59 AM, Ian Mallett [EMAIL PROTECTED] wrote: [casey talked about complexity] This is precisely the problem I have run into in one of my in-dev games--iterating over large arrays once per frame. Actually, it is basically a collision detection algorithm--I have two arrays, both containing 3D points. The points in one array must be tested with the points in the other to see how close they are. If they are close enough, there is a collision. Naturally, this means that for every point in one array, the other array must be iterated through and the 3D Pythagorean theorem applied to each tested point. Sounds like your approach is O(N^2). If most points aren't close enough to do the collision, partitioning to make it so you don't even have to do the check will do wonders.
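Brian's bucketing idea can be sketched as a dictionary-based spatial hash (the names and cell size are illustrative, not from the thread). Each point is filed under the integer coordinates of its cell; as long as the cell size is at least the collision radius, a query only has to look at the 27 neighbouring cells instead of every point:

```python
from collections import defaultdict

def build_grid(points, cell_size):
    """File each 3D point under the integer coordinates of its cell."""
    grid = defaultdict(list)
    for p in points:
        key = (int(p[0] // cell_size), int(p[1] // cell_size), int(p[2] // cell_size))
        grid[key].append(p)
    return grid

def collisions(grid, point, radius, cell_size):
    """Return points within `radius`, checking only the 27 nearby cells.

    Requires cell_size >= radius so one cell of slack suffices.
    """
    cx, cy, cz = (int(point[i] // cell_size) for i in range(3))
    r2 = radius * radius  # compare squared distances; no sqrt needed
    hits = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for q in grid.get((cx + dx, cy + dy, cz + dz), []):
                    d2 = sum((point[i] - q[i]) ** 2 for i in range(3))
                    if d2 <= r2:
                        hits.append(q)
    return hits

pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (50.0, 50.0, 50.0)]
grid = build_grid(pts, cell_size=5.0)
print(collisions(grid, (0.5, 0.0, 0.0), radius=2.0, cell_size=5.0))
```

With roughly uniform spread, each query touches a handful of points instead of all N, which is where the 1,000,000-checks-to-10,000-checks reduction in the message above comes from.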
Re: [pygame] Python and Speed
On Apr 17, 2008, at 12:26 PM, Ian Mallett wrote: On Thu, Apr 17, 2008 at 12:15 PM, Casey Duncan [EMAIL PROTECTED] wrote: Note this is not the most efficient way to do this, using a partitioned space you may be able to avoid comparing most points with one another most of the time. To do this in 2D you could use quad-trees, in 3D you could use oct-trees. See: http://en.wikipedia.org/wiki/Octree Yes, I've tried this, but there are issues with points being in two separate places. For example, if the collision radius is 5, and it is 3 away from the edge, then all the points in the neighboring trees must be tested. Partitioned space is certainly a more complex algorithm, but so long as all of your points (spheres?) are not close together, it is usually vastly more efficient. If the partition size is optimal, then the vast majority of particles will not be hitting the edge of a partition; that will be an edge-case (pun intended). Even for those that are, it's usually still faster than the naive O(N^2) method that compares every point with every other. This algorithm is only effective if the space is large relative to the collision geometries and they tend not to be clumped very close together. -Casey
Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 12:39 PM, Brian Fisher [EMAIL PROTECTED] wrote: I actually think the line of thinking I read in the comment above (thinking that I shouldn't have to optimize things, because the stupid language is slow) is in fact a counterproductive attitude in application development, and misses the real point. Obviously it is the problem of the programmer--it is the programmer who programs his program, not Python. Python just executes it. But the fact is, it makes a programmer's job easier if he has unlimited power to work with. Currently, I find myself having to stretch Python's limits, and, as you say, find optimizations. Programming is fun, but rewriting code so that you can meet a reasonable performance benchmark is not. It would be nice of course if python ran much much faster, but it runs slow largely because it is designed to give you the flexibility to code complex things much easier. I like this aspect of Python--its flexibility, but I object to the lack of speed. I want it all--flexibility and speed. If you don't want that flexibility, then you need to turn it off with pyrex and extensions and all that kind of stuff. I actually can't think of any situation where I would really need to do that. However sometimes that flexibility actually lets you code more efficient approaches to begin with. ...because the code is more clear. Better looking code usually runs faster, because clear code allows one to see any performance-sucking bugs... Ultimately all slowness is the programmer's problem, not the tool's. Of course. The programmer is the one who makes the program. The users would complain to the programmer, not Python, and, uh, they do. If a particular tool is the best to help you solve the problem, then it should be used. With python, coolness is always on, so it's cheap to use coolness. C++ was designed to make you not pay for anything you don't use, which means coolness is default off, which means it's really hard to use coolness. 
...to get to brass tacks though, I've found that the majority of the real slowness in _anything_ I code is due to the approach I take, and much less so due to the code. For example, pairwise collision between a set of objects. If every object needs to be checked against every object, that's an n-squared problem. Get 1000 items, that's 1,000,000 collision checks. But let's say I do object partitioning, or bucketing or something where I maintain sortings of the objects in a way that lets me only check items against ones that are close to it, and I either get log(n) partitioning or maybe I get at most about 10 items per bucket (both very achievable goals). Now it means I only do about 10,000 (10*1000) collision checks for the same real work being done. This is work that must be done. To do this in my case would be somewhat complicated, as I would need interpartition testing, boundary testing on the partitions and on each point, and various other modifications. Of course the code could be made faster, but this is something that I would have to do to get this program functioning at a good speed. Why not make Python faster, making such annoying modifications unnecessary and speeding up all Python in the process? So let's say that my python collision code takes 100 times as long as my c++ collision code - that means if I do the optimization in python, I can get the python code to go just as fast as the C code without the optimization. Not only that - let's say I decide I want to step stuff up to 10,000 items with pairwise collision - now it's 100,000,000 checks vs. like say 100,000 based on the approach - now python can actually be 10 times faster. That's an optimization which takes time and effort to implement. A C programmer very often has no need to do such optimizations, though he works with code I find horrid by comparison. So now the issue becomes what's the cost of writing the more efficient approach in python code vs. writing the naive approach in c++ code. 
If you think you get enough programmer benefits from working in python to make those 2 costs equal, and the performance of either is good enough, python is the better choice. Not only that, once you've got good approaches written in python that are stable and you don't need the coolness/flexibility, it becomes much easier to port the stuff to C, or pyrex it or whatever makes it much, much faster. The whole point of using Python, for me, is that it is far more flexible and programmer-friendly than anything else I've seen. I don't want to have to make a choice between Python and C just on a matter of speed--Python should be the clear choice. They should be equal in speed, but Python is easier to use. Obvious choice? Python. Sounds like your approach is O(N^2). If most points aren't close enough to do the collision, partitioning to make it so you don't even have to do the check will do wonders. Again, this problem is merely an example.
Re: [pygame] Python and Speed
Hi! I am just making an observation on this and objects; maybe I am missing the point, but when checking collisions, if you know your object's size, the vertex, or point depending on direction, could you not solve this by just the direct line between the 2 objects and not the surface? What I mean is, you are in control of your game, you know your objects, thus, you also know the point that will be hit depending on the direction you are traveling in, so why not just check that point or small collection of points? That is what I would do when running this kind of game. My Star Trek game does not use the actual pixel, but the area in which it is in, and that is a larger area. But when using pixel points, then only the points that fall in line with the direction you are traveling in. That seems to me to be a much faster check; little to almost 1 single point is what seems to be the result in this... In other words, you know your objects and where they are, then just draw a straight line between them and calculate the edge at that line drawn. I am not saying draw a line, just calculate to the edge of the object from both... Bruce On Apr 17, 2008, at 12:26 PM, Ian Mallett wrote: On Thu, Apr 17, 2008 at 12:15 PM, Casey Duncan [EMAIL PROTECTED] wrote: Note this is not the most efficient way to do this, using a partitioned space you may be able to avoid comparing most points with one another most of the time. To do this in 2D you could use quad-trees, in 3D you could use oct-trees. See: http://en.wikipedia.org/wiki/Octree Yes, I've tried this, but there are issues with points being in two separate places. For example, if the collision radius is 5, and it is 3 away from the edge, then all the points in the neighboring trees must be tested. Partitioned space is certainly a more complex algorithm, but so long as all of your points (spheres?) are not close together, it is usually vastly more efficient. 
If the partition size is optimal, then the vast majority of particles will not be hitting the edge of a partition; that will be an edge-case (pun intended). Even for those that are, it's usually still faster than the naive O(N^2) method that compares every point with every other. This algorithm is only effective if the space is large relative to the collision geometries and they tend not to be clumped very close together. -Casey
Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 12:45 PM, Casey Duncan [EMAIL PROTECTED] wrote: Partitioned space is certainly a more complex algorithm, but so long as all of your points (spheres?) Yes, one can think of one array as points and the other as spheres. are not close together, it is usually vastly more efficient. If the partition size is optimal, then the vast majority of particles will not be hitting the edge of a partition, that will be an edge-case (pun intended). Even for those that are it's usually still faster than the naive O(N^2) method that compares every point with every other. This algorithm is only effective if the space is large relative to the collision geometries and they tend not to be clumped very close together. In 3D space, there is a good deal of room, so I can see this being effective. But again, this is only one example of countless slowdowns I've had. Let's fix all of them in one fell swoop, no? The program is slowing down because the computer is processing the program relatively inefficiently. This is due to Python. Programmers shouldn't have to optimize their inefficiently executed code; the code should just be executed efficiently. Ian
Re: [pygame] Python and Speed
I'm not sure precisely what you mean... Again, remember that this is an example, not the question. The question is: How can Python be made faster? This is an example of one of the problems resulting from Python's relative slowness. Here's the example: -There is a list of 3D points. -There is another list of 3D points. -Every frame, for every point in the first list, if any point in the second list is a certain 3D distance away, then there is a collision. Ian
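Ian's example, stated as code (a minimal sketch; the function and variable names are mine, not from the thread). This is the naive version the thread keeps discussing: every point in the first list checked against every point in the second, once per frame:

```python
import math

def find_collisions(points_a, points_b, radius):
    """Naive O(N*M) check: every point in A against every point in B."""
    hits = []
    for a in points_a:
        for b in points_b:
            # 3D Pythagorean theorem, as described in the thread.
            d = math.sqrt((a[0] - b[0]) ** 2 +
                          (a[1] - b[1]) ** 2 +
                          (a[2] - b[2]) ** 2)
            if d <= radius:
                hits.append((a, b))
    return hits

points_a = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
points_b = [(0.0, 1.0, 0.0)]
print(find_collisions(points_a, points_b, radius=2.0))
```

The replies that follow attack exactly this shape of loop: drop the sqrt, then drop most of the pair comparisons via partitioning.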
Re: [pygame] Python and Speed
In that specific case, no matter which programming language you use, your code will not be very fast. Do you think programmers write unoptimized code in c, and get speedy execution every time? Have you not ever used a program or played a game which ran slower than it should, which was actually programmed in a fast language? Optimizing code and improving algorithms have been around far longer than python, and are an important part of programming in general. As has been mentioned before, there have been many attempts to optimize core python, which have resulted in some improvements. Python 2.5 is considerably faster than python 1.0. However, due to the nature of the language it can only be optimized so much. The best bet really, is to write code in c that needs to be fast, and call that code from python, using python as a glue language. This can be accomplished using implementations that already exist (pyode, numpy) or writing a new implementation and exposing it with pyrex or other linking programs. There is no magic bullet that will make python faster, and it's not for lack of trying. Even at its theoretical most optimized, I don't think pure python will ever be as fast as c. I do hope it gets close, but even if this were to be the case, your collision detection code will still be slow as heck. Culling algorithms for this purpose were invented to speed up applications written in C, after all :)
Re: [pygame] Python and Speed
On Thu, 17 Apr 2008, you wrote: No, this is the place to discuss it because if we wish to make games, work with existing platforms, and want speed, that is the way to go. Now that we have had this discussion, and found solutions, now we have a list of ways to resolve it. Sure, discuss ways to work with the current interpreter, but it's quite pointless talking about changes to the interpreter itself. I'm almost certain none of the core devs are on this list, and the python-dev list is a much more appropriate place to discuss it anyway :) Richard
Re: [pygame] Python and Speed
Patrick Mullen wrote: Also, if you are using sqrt for your distance check, you are likely wasting cpu cycles, if all you need to know is whether they are close enough. Also note that if you do need to compare exact Euclidean distances for some reason, you can avoid square roots by comparing the squares of the distances instead of the distances themselves. -- Greg
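Greg's point in a few lines (an illustrative sketch, not from the thread): squaring is monotonic for non-negative numbers, so comparing squared distances gives the same answer as comparing the distances themselves, with no square root:

```python
import math

p, q, r = (0.0, 0.0), (3.0, 4.0), (6.0, 8.0)

def dist_sq(a, b):
    """Squared Euclidean distance: no sqrt required."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

# Same ordering with and without the square root:
with_sqrt = math.hypot(p[0] - q[0], p[1] - q[1]) < math.hypot(p[0] - r[0], p[1] - r[1])
without_sqrt = dist_sq(p, q) < dist_sq(p, r)
print(with_sqrt, without_sqrt)  # both True
```

The same trick works for the "close enough?" threshold test: precompute `radius * radius` once and compare against `dist_sq`.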
Re: [pygame] Python and Speed
Ian Mallett wrote: Here's the example: -There is a list of 3D points -There is another list of 3D points. -Every frame, for every point in the first list, if any point in the second list is a certain 3D distance away, then there is a collision. The responses to it also provide a good example of a very important principle: Often, using a better algorithm can give you much greater gains than speeding up the implementation of the one you're using. So when something is too slow, the first thing you should ask is: Does there exist a better algorithm for what I'm trying to do? -- Greg
Re: [pygame] Python and Speed
René Dudfield wrote: - SIMD instructions are the fast ones... It's doubtful there's much in the Python core that would benefit from SIMD, though. Most of what it does doesn't involve doing repetitive operations on big blocks of data. -- Greg
Re: [pygame] Python and Speed
Ian Mallett wrote: How do you write an extension module in C and call it from Python? Someone gave some instructions earlier, but I found them too vague... Another way I forgot to mention earlier is to use the ctypes module (I often forget about it, because it wasn't in the stdlib until very recently.) That allows you to call compiled routines from a shared library directly without having to write any C. It's less efficient, though, as it has to go through some Python wrapper objects to get there, and also more dangerous, because you can easily crash the interpreter if you don't get everything exactly right. -- Greg
Re: [pygame] Python and Speed
Ian Mallett wrote: Yes, I've tried this, but there are issues with points being in two separate places. For example, if the collision radius is 5, and it is 3 away from the edge, then all the points in the neighboring trees must be tested. Rather than a tree, you may be just as well off using a regular array of cells. That makes it much easier to find the neighbouring cells to test, and there's also less overhead from code to manage and traverse the data structure. The only time you would really need a tree is if the distribution of the objects can be very clumpy, so that you benefit from an adaptive subdivision of the space. Another possibility to consider is instead of testing neighbouring cells, insert each object into all cells that are within the collision radius of it. That might turn out to be faster if the objects don't move very frequently. -- Greg
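Greg's second suggestion can be sketched like this (an illustrative 2D implementation; the names and cell size are assumptions, not from the thread): each object is inserted into every cell its collision radius overlaps, so a lookup only ever has to examine a single cell rather than its neighbours:

```python
from collections import defaultdict

CELL = 4.0  # illustrative cell size

def insert(grid, obj_id, pos, radius):
    """Register the object in every cell its collision radius overlaps."""
    x0, x1 = int((pos[0] - radius) // CELL), int((pos[0] + radius) // CELL)
    y0, y1 = int((pos[1] - radius) // CELL), int((pos[1] + radius) // CELL)
    for cx in range(x0, x1 + 1):
        for cy in range(y0, y1 + 1):
            grid[(cx, cy)].append(obj_id)

def candidates(grid, pos):
    """Possible collisions: only the single cell containing pos."""
    return grid.get((int(pos[0] // CELL), int(pos[1] // CELL)), [])

grid = defaultdict(list)
insert(grid, "ship", (5.0, 5.0), radius=3.0)
insert(grid, "rock", (30.0, 30.0), radius=1.0)
print(candidates(grid, (7.5, 5.0)))  # ["ship"]
```

As Greg notes, the trade-off is that inserts become more expensive (one object may land in several cells), so this wins when objects move infrequently relative to how often you query.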
Re: [pygame] Python and Speed
ctypes is great stuff! I find it much harder to crash the interpreter with ctypes than with extensions I've developed and debugged. It is quite resilient. I've used it to interface with the Windows API to simulate keystrokes, to interface to a USB Digital IO interface, in a wrapper for the huge OpenCV library, to set the background image on my desktop, to adjust the system volume control, to interface to the Wiimote, and to wrap eSpeak to name a few. I've pretty much given up on swig and pyrex. gb Greg Ewing wrote: Ian Mallett wrote: How do you write an extension module in C and call it from Python? Someone gave some instructions earlier, but I found them too vague... Another way I forgot to mention earlier is to use the ctypes module (I often forget about it, because it wasn't in the stdlib until very recently.) That allows you to call compiled routines from a shared library directly without having to write any C. It's less efficient, though, as it has to go through some Python wrapper objects to get there, and also more dangerous, because you can easily crash the interpreter if you don't get everything exactly right.
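A minimal ctypes example in the spirit of what Greg and the poster describe: calling a compiled C routine (here, sqrt from the system math library on a typical Unix) directly from Python, with no wrapper C code. Note the restype/argtypes declarations; omitting them is exactly the kind of "not getting everything exactly right" that can corrupt values or crash the interpreter.

```python
import ctypes
import ctypes.util

# Locate the C math library (libm on Linux; part of libSystem on macOS).
path = ctypes.util.find_library("m")
libm = ctypes.CDLL(path)

# Declare the signature: double sqrt(double). Without this, ctypes would
# pass and return ints and silently mangle the values.
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(16.0))  # 4.0
```

The same pattern scales up to wrapping your own compiled collision-detection routine: build it as a shared library, load it with CDLL, declare the signatures, and call it from the game loop.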
Re: [pygame] Python and Speed
Hi! Another thought, same as before but adding the other comment about bins. If your object is heading in a certain direction and you know the surface point of that object, now make a dictionary key point for it. Same for all other objects, knowing their direction. key=str(x)+str(-y)+str(-z) #Keeping all as integer values in a coordinate grid. The only thing you compare are the keys. Then you will say, wait a minute, both are not going to collide at that point! OK, then you know the direction of both and the point where they would in fact collide. Now that point will be approaching both at the same point in that straight line. The vertex of the point may change where they intersect, but keep taking the key value of that point because both will always have that vertex point on that point on the surface. Keep updating the dictionary key and compare the value for both with the if key in... That point is a key value and easy to check. It is not the surface, just the intersection point of both straight lines of the object traveling through free space... So you will be checking 2 keys for 2 objects, 3 keys for 3 objects, and so on... The only addition to this is if you have an object that is not a perfect shape, like a boulder, where that outer edge will change depending on the angle of your straight line to the vertex or intersection point. Not checking surfaces, just checking the intersection point of the line, for both will have to meet, and the distance to that object's surface will also be that straight line distance for the object center, which will still direct you to the vertex/intersection point. Both surface points that meet, having the same key value, will also be the collision point when they do in fact collide... So all points will match; just another observation and thought in this faster and faster check of objects. For I do something like this with my primitive Battleship game. Instead of an array, just keys for where an object part is located. 
When the missile or shell hits that key it matches the objects location at that same key. No completely filled huge array, just points. When doing this you have up to 3 points to calculate: point of surface of object 1, point on surface of object 2 and then the intersection point of the vector of both objects. When all 3 points match with the same key, you have your collision! Like I said before, the updating is the vector angle, surface location and the intersection point of that straight line of the vectors of both objects. Bruce René Dudfield wrote: - SIMD instructions are the fast ones... It's doubtful there's much in the Python core that would benefit from SIMD, though. Most of what it does doesn't involve doing repetitive operations on big blocks of data. -- Greg
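Bruce's key-lookup idea, sketched with integer tuples as dictionary keys (the names are illustrative; using a tuple rather than his `str(x)+str(-y)+str(-z)` concatenation avoids ambiguities such as `str(1)+str(12)` and `str(11)+str(2)` both producing `"112"`):

```python
# Occupied grid cells for each object part, keyed by integer coordinates.
battleship = {(2, 3): "bow", (2, 4): "midsection", (2, 5): "stern"}

def check_hit(shell_pos):
    """A shell hits when its grid cell matches an occupied key."""
    key = (int(shell_pos[0]), int(shell_pos[1]))
    if key in battleship:       # dictionary lookup, no array sweep
        return battleship[key]
    return None

print(check_hit((2.7, 4.1)))  # midsection
print(check_hit((9.0, 9.0)))  # None: open water
```

This is the same observation as the spatial-hash suggestions earlier in the thread: a dictionary keyed on coordinates stores only occupied points, so a hit test is one lookup instead of a sweep over a filled array.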
Re: [pygame] Python and Speed
Ian Mallett wrote: Programmers shouldn't have to optimize their inefficiently executed code, the code should just be executed efficiently. Even if your inefficient algorithm is being executed as fast as possible, it's still an inefficient algorithm, and you will run into its limitations with a large enough data set. Then you will have to find a better algorithm anyway. Part of the skill of being a good programmer is having the foresight to see where such performance problems are likely to turn up further down the track, and choosing an algorithm at the outset that at least isn't going to be spectacularly bad. Doing this saves you work in the long run, since you spend less time going back and re-coding things. -- Greg
Re: [pygame] Python and Speed
Devon Scott-Tunkin wrote: would you set out 2d partitioning the screen in pygame by making say 4 invisible sprites with rects ... or by just using x,y values Just use coordinates. Sprites are totally unnecessary, since they're not something that appears on the screen. -- Greg
Re: [pygame] Python and Speed
On Thu, Apr 17, 2008 at 3:03 PM, Greg Ewing [EMAIL PROTECTED] wrote: Patrick Mullen wrote: Also, if you are using sqrt for your distance check, you are likely wasting cpu cycles, if all you need to know is whether they are close enough. Nope; in this case, the calculations' margin of error must be very small. Also note that if you do need to compare exact Euclidean distances for some reason, you can avoid square roots by comparing the squares of the distances instead of the distances themselves. I already do that. On Thu, Apr 17, 2008 at 4:02 PM, Greg Ewing [EMAIL PROTECTED] wrote: Rather than a tree, you may be just as well off using a regular array of cells. That makes it much easier to find the neighbouring cells to test, and there's also less overhead from code to manage and traverse the data structure. I meant that. What is the difference between a tree and a cell? Cells are regular? Anyway, I had planned to do cells. The only time you would really need a tree is if the distribution of the objects can be very clumpy, so that you benefit from an adaptive subdivision of the space. Another possibility to consider is instead of testing neighbouring cells, insert each object into all cells that are within the collision radius of it. That might turn out to be faster if the objects don't move very frequently. I like that idea. Still, the objects can and do move. On Thu, Apr 17, 2008 at 4:23 PM, Greg Ewing [EMAIL PROTECTED] wrote: There are an extremely large number of modifications that could be made to Python. Only a very small number of them will result in any improvement, and of those, all the easy-to-find ones have already been found. The harder ones must then be attacked--solving a difficult speed issue might save the tiresome implementation of optimizations on the part of hundreds of users. 
If you want to refute that, you're going to have to come up with an actual, specific proposal, preferably in the form of a code patch together with a benchmark that demonstrates the improvement. If you can't do that, you're not really in a position to make statements like "it can't be that hard". Like I said, because I am the programmer, not the Python modifier, it is my job to make the programs run fast. By "it can't be that hard", I mean that if C++ can do it, Python should be able to too. Obviously, I don't know how Python is structured, and I doubt I have the experience of the people on the Python team, but if I can make optimizations in my code, they should be able to make modifications in Python. If wishing could make it so, Python would already be blazingly fast! Ian is wishing... On Thu, Apr 17, 2008 at 4:32 PM, Greg Ewing [EMAIL PROTECTED] wrote: Even if your inefficient algorithm is being executed as fast as possible, it's still an inefficient algorithm, and you will run into its limitations with a large enough data set. Then you will have to find a better algorithm anyway. Part of the skill of being a good programmer is having the foresight to see where such performance problems are likely to turn up further down the track, and choosing an algorithm at the outset that at least isn't going to be spectacularly bad. Doing this saves you work in the long run, since you spend less time going back and re-coding things. Not necessarily. I've had situations where I've decided to do something, then draft-coded it, then decided that the game feature wasn't necessary, was the wrong approach, or simply to scrap the entire project. If I had decided to spend the time to code something with all of the optimizations, it would have taken longer. When I delete it, that extra effort would have been worthless. Once a feature is implemented, tested, and so on, I decide if it is needed, then add optimizations and comments. 
Because optimizations often rewrite code in a more efficient way, the original code works as a guideline for the operation. All this saves me effort in the long-run; but I can't speak for anyone else. Greg Ian
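The regular grid of cells Greg suggests, combined with the squared-distance comparison mentioned earlier in the thread, can be sketched in plain Python. The cell size and the point format here are illustrative assumptions, not anything from the original discussion:

```python
from collections import defaultdict

CELL_SIZE = 64  # hypothetical cell size; keep it >= the collision radius

def cell_of(x, y):
    """Map a position to the integer grid cell containing it."""
    return (int(x // CELL_SIZE), int(y // CELL_SIZE))

def near_pairs(points, radius):
    """Yield index pairs (i, j), i < j, whose points lie closer than radius.

    Each point is inserted into exactly one cell, and only the 3x3 block
    of neighbouring cells is searched, so the all-pairs scan is avoided.
    Comparing squared distances avoids the square root entirely.
    """
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[cell_of(x, y)].append(i)
    r2 = radius * radius
    for i, (x, y) in enumerate(points):
        cx, cy = cell_of(x, y)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    if j <= i:
                        continue  # report each pair only once
                    ox, oy = points[j]
                    if (x - ox) ** 2 + (y - oy) ** 2 < r2:
                        yield i, j

print(list(near_pairs([(0, 0), (10, 0), (500, 500)], 20)))  # -> [(0, 1)]
```

Keeping the cell size at least as large as the collision radius is what makes the 3x3 neighbourhood sufficient; with smaller cells, more neighbours would have to be scanned.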
Re: Re: [pygame] Python and Speed
Ian Mallett [EMAIL PROTECTED] wrote: ... if C++ can do it, Python should be able to too. Obviously, I don't know how Python is structured ... Then please learn more about CPython's internals and the general problem of optimising a dynamic language like Python. The CPython code is incredibly well-written and easy to understand, even with the various complications that exist due to current optimisations* and there's plenty of academic research into dynamic languages out there. I'm sure the core devs would welcome your patches with open arms! I'll leave you with Greg's wisdom, which perhaps needs repeating: If wishing could make it so, Python would already be blazingly fast! Richard * for example the double dict-lookup implementations that seamlessly replace the string-only lookup with an arbitrary-object lookup on the first non-string lookup, or the function-call frame caching mechanism that I had a hand in implementing...
Re: [pygame] Python and Speed
On Wed, Apr 16, 2008 at 6:43 PM, Dan Krol [EMAIL PROTECTED] wrote: Are you familiar with the handful of ways to optimize parts of your code? I've used Psyco to great effect, but all of these extra modules seem to be treating the symptoms, not the problem. The problem is flat-out and final--Python is slow, especially compared to C languages. Swig is the classic one I think, it wraps some C source into a python module. Pyrex does a similar thing, but it lets you wrap it yourself with a python-esque language; it lets you use python types and C types within the same pyrex file, so you have control over when things get converted between them. Cython is Pyrex plus more features, but they're less conservative than Pyrex as far as stability goes (I think). Though their opinion is that Pyrex is too conservative. Check it out. I think Boost somehow does it too, you'll have to look that up. I should look into these for the time being. I've never really used any of these other than to test them out, though. Thanks, Ian
Re: [pygame] Python and Speed
Ian Mallett wrote: I feel like Python is living down to its namesake's pace... Ah yes, the speedy Monty Python.
Re: [pygame] Python and Speed
On Wed, Apr 16, 2008 at 6:59 PM, Aaron Maupin [EMAIL PROTECTED] wrote: Ah yes, the speedy Monty Python. I love Monty Python! I was referring to the generally lethargic nature of the snake. Ian
Re: [pygame] Python and Speed
hi, Each release of python gets a little faster... but not massively. It really needs to get 10-20x faster - but generally only gets up to 1.2x faster with each release. There's also work on things like pypy - which might one day be quite fast. I think pypy will drive Cpython to get faster through competition - and ideas. An example of this recently happening is the method cache (which I think the idea actually came from tinypy...). The method cache was shown to work well with pypy, and then Cpython added the idea. If pypy becomes faster, then I think the Cpython people will try harder to make Cpython faster too. However, mainly it's good to try and make highly reusable, fast basic building blocks - and then glue them together with python. For example, if you see something that most pygame games would get faster with, then add it to pygame. Or to SDL, or to python. If the drawing part of the game takes 10% less time, that leaves 10% of time for game code. As examples in the last pygame release - the pygame.sprite.LayeredDirty sprite work, the threading work, and modifications to some functions to allow reusing surfaces (e.g. transform.scale) should make a lot of pygame games quicker. For SDL the blitters have been optimized with mmx, and altivec assembly - and the upcoming SDL 1.3 can optionally use opengl, and direct3d hardware acceleration. Also the included PixelArray should allow you to do a lot of things quicker - and you can rely on it to be included with pygame (unlike Numeric/numpy). We hope to have fast vector, and matrix types included at some point in the future too. If you've got any ideas for reusable speed ups - we'll gladly consider them in pygame. cheers, On Thu, Apr 17, 2008 at 11:36 AM, Ian Mallett [EMAIL PROTECTED] wrote: Recently, I've been having issues in all areas of my programming experience where Python is no longer fast enough to fill the need adequately. 
I really love Python and its syntax (or lack of), and all the nice modules (such as PyGame) made for it. Are there any plans to improve Python's speed to at least the level of C languages? I feel like Python is living down to its namesake's pace... Ian
Re: [pygame] Python and Speed
Ian Mallett [EMAIL PROTECTED] wrote: Are there any plans to improve Python's speed to at least the level of C languages? This isn't really the best forum for asking such a question. I would recommend asking on the general Python mailing list / newsgroup (comp.lang.python on http://www.python.org/community/lists/). I think I speak for all Python developers when I say that we'd love for the language to run faster. And of course the large body of core CPython developers are aware of this. I've personally attended a sprint in Iceland during which we spent a week solely focused on speeding up the CPython interpreter. There's just not much that can be done with the current CPython implementation to make it faster. Thus it falls to you as a developer to choose your implementation strategy wisely: 1. pick sensible libraries that handle large amounts of processing for you (whether that be numeric or graphic) 2. where there is no existing library, you may need to code speed-critical parts of your application using C, or the more programmer-friendly Pyrex (er, Cython these days I believe :) Richard
Re: [pygame] Python and Speed
Hi Ian, I think what you are saying and I agree, is that when someone has fixed something by going back to C code, then why not make a module for that code. Thus all you do is insert the C code using a Python/Pygame module name... But slowing down is when it uses the Python interpreter, but why not the C interpreter? Or make Python code that uses that format, but runs under the C interpreter? After all, it is all about ease in writing, higher level language using the lower level code under just a different name for translation, but normal C code once interpreted or translated... Bruce Ian Mallett [EMAIL PROTECTED] wrote: Are there any plans to improve Python's speed to at least the level of C languages? This isn't really the best forum for asking such a question. I would recommend asking on the general Python mailing list / newsgroup (comp.lang.python on http://www.python.org/community/lists/). I think I speak for all Python developers when I say that we'd love for the language to run faster. And of course the large body of core CPython developers are aware of this. I've personally attended a sprint in Iceland during which we spent a week solely focused on speeding up the CPython interpreter. There's just not much that can be done with the current CPython implementation to make it faster. Thus it falls to you as a developer to choose your implementation strategy wisely: 1. pick sensible libraries that handle large amounts of processing for you (whether that be numeric or graphic) 2. where there is no existing library, you may need to code speed-critical parts of your application using C, or the more programmer-friendly Pyrex (er, Cython these days I believe :) Richard
Re: [pygame] Python and Speed
On Wed, Apr 16, 2008 at 7:46 PM, FT [EMAIL PROTECTED] wrote: Hi Ian, I think what you are saying and I agree, is that when someone has fixed something by going back to C code, then why not make a module for that code. Thus all you do is insert the C code using a Python/Pygame module name... So I have a C file TheWeirdAndBoringNameForATitleBecauseICantThinkOfAGoodModuleNameRightNow.c and I go: import TheWeirdAndBoringNameForATitleBecauseICantThinkOfAGoodModuleNameRightNow #later TheWeirdAndBoringNameForATitleBecauseICantThinkOfAGoodModuleNameRightNow.#function ? But slowing down is when it uses the Python interpreter, but why not the C interpreter? Or make Python code that uses that format, but runs under the C interpreter? After all, it is all about ease in writing, higher level language using the lower level code under just a different name for translation, but normal C code once interpreted or translated... All that is very true, particularly After all, it is all about ease in writing, higher level language using the lower level code under just a different name. I could certainly live with keeping Python based on C as long as it is still fast, like with a C interpreter or what-not, though it will mean Python's continued dependence on C and no chance for competition... Ian
Re: [pygame] Python and Speed
Ian Mallett wrote: Why not? If C is faster, why can't Python be equally so? If we assume that C is the fastest language available, then that implies that Python /could/ be faster should it adopt the same procedures as C. But adopting the same procedures as C would mean becoming a statically-typed low-level language -- losing most of the things that make Python nicer to program in than C! C is fast because it's designed to be easily translatable into very efficient machine code. When a C compiler sees int a, b, c; a = b + c; it knows that a, b and c are all integers that fit in a machine word, and it knows their memory addresses, so it generates about 2 or 3 instructions to do the job. However, when a Python implementation sees a = b + c it has no idea what types of objects b and c will refer to at run time. They might be ints, they might be long ints that don't fit in a machine word, they might be strings, they might be some custom object of your own with an __add__ method. And they may be different for different executions of that statement. So every time the statement is executed, the interpreter must look up the names b and c (namespaces can dynamically gain or lose names, so we can't assume they're always at a particular address), examine the types of the objects they refer to, figure out what operation needs to be done, and carry it out. Doing all that takes a great many more machine instructions than adding a couple of ints in C. There are various ways of addressing the speed issue with dynamic languages. One is what's known as type inferencing, where the compiler examines the whole program, thinks about it very hard, and tries to convince itself that certain variables can only ever hold values of certain types; specialised code is then generated based on that. This works best in languages that are designed for it, e.g. Haskell; applying it post-hoc to a dynamic language such as Python tends to be much harder and not work so well. 
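The dispatch cost described above is easy to see in ordinary code: one and the same b + c statement does entirely different work depending on the runtime types of its operands. The Celsius class below is a made-up stand-in for the "custom object with an __add__ method" case:

```python
def add(b, c):
    # One statement, one piece of code -- but the interpreter must
    # inspect the runtime types of b and c on every single call.
    return b + c

class Celsius:  # hypothetical class illustrating a user-defined __add__
    def __init__(self, deg):
        self.deg = deg
    def __add__(self, other):
        return Celsius(self.deg + other.deg)

print(add(2, 3))          # -> 5       (integer addition)
print(add("py", "game"))  # -> pygame  (string concatenation)
print(add([1], [2]))      # -> [1, 2]  (list concatenation)
print(add(Celsius(20), Celsius(5)).deg)  # -> 25 (user-defined __add__)
```

A C compiler, by contrast, fixes the meaning of + at compile time, which is exactly why it can emit two or three instructions and stop there.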
Another is to use just-in-time techniques, where you look at what types are actually turning up at run time and generate specialised code for those cases. Psyco is an attempt to apply this idea to Python; reportedly it can produce useful speedups in some cases. There's a project around called PyPy which is using type inferencing and various other strange and wonderful techniques in an attempt to produce a self-hosted Python implementation, i.e. written entirely in Python. Last I heard, it was still somewhat slower than CPython. I've gathered that Python is based on C code, so it translates your code into C code on the fly. No, it doesn't translate it into C, it just executes it directly. There is a translation step of sorts, into so-called bytecodes, which are instructions for a virtual machine. But the instructions are very high-level and correspond almost one-for-one with features of the Python language. The interpreter then executes these virtual instructions. The CPython interpreter happens to be written in C, but it could have been written in any language that can be compiled to efficient machine code, and the result would be much the same. How can you run a C file from a Python script? You can't, not directly. You need to write what's known as an extension module, which is a wrapper written in C that bridges between the worlds of Python objects and C data types. You then compile this into a dynamically linked object file, which can be imported as though it were a Python module. Writing an extension module by hand is rather tedious and not for the faint of heart. It has to handle all conversion between Python objects and C data, manage the reference counts of all Python objects it deals with, and be scrupulous about checking for errors and reporting them. If you want to get an idea of what's involved, have a look at the Extending and Embedding and Python/C API sections of the Python documentation. 
Fortunately, there are a variety of tools available to make the task easier, such as SWIG, Boost Python (for C++), Pyrex and Cython. If you want to try this, my personal recommendation would be Pyrex, although this is not exactly an unbiased opinion, given that I wrote it. :-) http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ -- Greg
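The bytecode translation step described above can be inspected directly with the stdlib dis module:

```python
import dis

def area(w, h):
    return w * h

# Each row of output is one virtual-machine instruction. The exact
# opcode names vary between CPython versions, but the load-operands /
# multiply / return shape maps almost one-for-one onto the source line.
dis.dis(area)
```

This also makes Greg's point concrete: the instructions name high-level operations on objects, not machine-word arithmetic at fixed addresses.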
Re: [pygame] Python and Speed
On Wed, Apr 16, 2008 at 9:10 PM, Greg Ewing [EMAIL PROTECTED] wrote: There are various ways of addressing the speed issue with dynamic languages. One is what's known as type inferencing, where the compiler examines the whole program, thinks about it very hard, and tries to convince itself that certain variables can only ever hold values of certain types; specialised code is then generated based on that. This works best in languages that are designed for it, e.g. Haskell; applying it post-hoc to a dynamic language such as Python tends to be much harder and not work so well. I realised this problem when I first thought of it, but I didn't know it had already been tried. Oh well. Another is to use just-in-time techniques, where you look at what types are actually turning up at run time and generate specialised code for those cases. Psyco is an attempt to apply this idea to Python; reportedly it can produce useful speedups in some cases. I have used it to great effect. For example, the speed of the particles demo in my PAdLib is almost completely due to Psyco. Indeed, it approximately doubles the demo's speed--or I just give it twice as many particles to chew on. No, it doesn't translate it into C, it just executes it directly. There is a translation step of sorts, into so-called bytecodes, which are instructions for a virtual machine. But the instructions are very high-level and correspond almost one-for-one with features of the Python language. The interpreter then executes these virtual instructions. The CPython interpreter happens to be written in C, but it could have been written in any language that can be compiled to efficient machine code, and the result would be much the same. How can you run a C file from a Python script? You can't, not directly. You need to write what's known as an extension module, which is a wrapper written in C that bridges between the worlds of Python objects and C data types. 
You then compile this into a dynamically linked object file, which can be imported as though it were a Python module. Writing an extension module by hand is rather tedious and not for the faint of heart. It has to handle all conversion between Python objects and C data, manage the reference counts of all Python objects it deals with, and be scrupulous about checking for errors and reporting them. If you want to get an idea of what's involved, have a look at the Extending and Embedding and Python/C API sections of the Python documentation. Fortunately, there are a variety of tools available to make the task easier, such as SWIG, Boost Python (for C++), Pyrex and Cython. If you want to try this, my personal recommendation would be Pyrex, although this is not exactly an unbiased opinion, given that I wrote it. :-) Like I said, these are all good options I should look into. http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ -- Greg Ian
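As a lighter-weight alternative to the wrapper tools listed above, the stdlib ctypes module (added in Python 2.5) can call an already-compiled C library directly, with no extension module to write. A minimal sketch, assuming a Unix-like system where the C math library can be located:

```python
import ctypes
import ctypes.util

# Locate libm; the library name is platform-dependent, so find_library
# is tried first with a common glibc name as a fallback (an assumption).
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# Declare the C signature, double sqrt(double), so ctypes converts
# Python floats correctly in both directions.
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

print(libm.sqrt(2.0))  # -> 1.4142135623730951
```

This trades the safety and convenience of a real extension module (or Pyrex) for zero build steps, which is often enough for wrapping one or two hot functions.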
Re: [pygame] Python and Speed
René Dudfield wrote: 2. - asm optimizations. There seems to be almost no asm optimizations in CPython. That's a deliberate policy. One of the goals of CPython is to be very portable and written in a very straightforward way. Including special pieces of asm for particular architectures isn't usually considered worth the maintenance effort required. CPython could use faster threading primitives, and more selective releasing of the GIL. Everyone would love to get rid of the GIL as well, but that's another Very Hard Problem about which there has been much discussion, but little in the way of workable ideas. A way to know how much memory is being used. Memory profiling is the most important way to optimize since memory is quite slow compared to the speed of the cpu. Yes, but amount of memory used doesn't necessarily have anything to do with rate of memory accesses. Locality of reference, so that things stay in the cache, is more important. perhaps releasing a patch with a few selected asm optimizations might let the python developers realise how much faster python could be... Have you actually tried any of this? Measurement would be needed to tell whether these things address any of the actual bottlenecks in CPython. a slot int attribute takes up 4-8 bytes, whereas a python int attribute takes up (guessing) 200 bytes. Keep in mind that the slot only holds a reference -- the actual int object still takes up memory elsewhere. Slots do reduce memory use somewhat, but I wouldn't expect that big a ratio. -- Greg
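Greg's caveat, that a slot holds only a reference while the int object itself lives elsewhere, can be checked with sys.getsizeof, which counts only the object passed to it and not what it refers to. A rough sketch with two made-up point classes:

```python
import sys

class Plain:  # hypothetical class: attributes live in a per-instance __dict__
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Slotted:  # hypothetical class: attributes live at fixed slot offsets
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x = x
        self.y = y

p, s = Plain(1, 2), Slotted(1, 2)
# The plain instance pays for the object header plus a whole dict;
# the slotted one pays for the header plus two references. Neither
# figure includes the int objects, which are stored elsewhere.
print(sys.getsizeof(p) + sys.getsizeof(p.__dict__))
print(sys.getsizeof(s))
```

The exact byte counts vary by CPython version and platform, so the useful comparison is the ratio, which is real but nowhere near the 4-8 bytes vs ~200 bytes guessed above.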