Hi, that's good :)
SDL does it by compiling the MMX code in if the compiler supports it. It then has runtime checks to see whether the CPU supports it. SDL also has a configure flag, which you can use to tell it not to even try compiling the MMX code.

So if the compiler doesn't support it, the C version is used. If the compiler supports it, and the runtime CPU detection finds MMX, then the MMX version is used. Otherwise it falls back to the C version.

On 6/20/07, Richard Goedeken <[EMAIL PROTECTED]> wrote:
Rene,

I would be willing to add the CPU detection functions, but I can't think of how they could be implemented in a useful way. The compile-time checks have to stay in, because trying to compile the 64-bit code on a 32-bit architecture, or the 32-bit MMX code on a PPC or similar, will cause compile-time errors. So it's a given that if someone is running an i386, PPC, Sun, ARM, etc., they will get the C code. If they're running i686, they'll get the 32-bit MMX, and for x86_64 they'll get the 64-bit MMX.

So the dilemma is that whatever build a user is running, the code is pretty much guaranteed to work on their CPU. If someone is running the i686 build on a 486 or something silly like that, they'll probably have bigger problems. We could allow someone to 'downgrade' and run the C code when their CPU supports MMX, but what's the point?

Regards,
Richard

René Dudfield wrote:
> Nice one!
>
> This sounds like a very nice scaling function.
>
> It'd be cool if we could include a runtime way of including MMX and
> other CPU-specific optimizations. Probably using the SDL methods would
> be the way to go.
>
> I've added it to the todo list for this week's mini sprint.
> http://www.pygame.org/wiki/todo
> So hopefully it'll get into pygame soon.
>
> If you feel like figuring out how to use the SDL MMX detection routines
> to select the MMX routine at runtime, that'd be cool.
>
>
> On 6/18/07, Richard Goedeken <[EMAIL PROTECTED]> wrote:
>
> Hello everyone. I just joined the list; my name is Richard Goedeken.
> I'm using Pygame in a project that I've been working on for a few weeks,
> and I wanted an image scaling function with higher visual quality than
> the nearest-neighbor algorithm which is included with the 'scale'
> function. So I wrote one; it's in the attached zip file. I hereby give
> the Pygame maintainers permission to include and distribute this code
> with the Pygame project under the license of their choice.
>
> The algorithm which I've implemented is interesting. Each axis is
> scaled independently, which gives it the property that scaling an image
> only in the X dimension or only in the Y dimension will be about twice
> as fast as scaling both. This design was chosen because the axes are
> scaled differently depending upon whether they are being shrunk or
> expanded. For expansion, a bilinear filter is used, which looks nice at
> magnifications under 3x or so and is quick. For shrinking the image, a
> novel area-averaging algorithm is used, which suppresses Moiré patterns
> and looks good even at very small sizes.
>
> The source code is in transform.c. It's pretty big because I've also
> included inline MMX routines for the i686 and x86_64 architectures under
> Unix. The AT&T-style asm syntax won't work with the Intel or MS
> compilers, but someone could translate it and add Intel-style code for
> Win32. It runs a lot faster with the MMX code. I have included a test
> program (scaletest.py) which can run a short benchmark series of scaling
> operations. When run with a 600k pixel image, I got the following
> results:
>
> Machine          Algorithm    Code level  Shrink time  Expand time
> Athlon64 3800+   smoothscale  C-only            36 ms        96 ms
> Athlon64 3800+   smoothscale  64-bit MMX         5 ms        16 ms
> Athlon64 3800+   scale        C-only             2 ms        13 ms
> Pentium 3-800    smoothscale  C-only            64 ms       180 ms
> Pentium 3-800    smoothscale  32-bit MMX        39 ms       119 ms
> Pentium 3-800    scale        C-only            17 ms        85 ms
>
> I was surprised that the MMX ran so much (6x) faster than the C code on
> my 64-bit machine. But I'm happy that it actually comes close to
> matching the nearest-neighbor 'scale' function. I think the P-3 may
> have been hindered by relatively low memory bandwidth. With newer
> 32-bit architectures such as the Core 2 or Athlon, I believe that the
> MMX will give a bigger speed gain over the C than it does on the P-3.
>
> The 'config.py' file is also modified to set CFLAGS to activate the
> inline assembly code.
> I've integrated this new function into my project
> system, and it's quite a nice visual upgrade. I'm sure there are a lot
> of people who could use a relatively fast smooth scaling algorithm in
> the pygame software, so enjoy!
>
> Richard
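
For anyone following the thread, the SDL-style scheme described at the top (compile the MMX path only where the compiler can build it, then pick a routine at run time) might look roughly like the sketch below. Everything named here is hypothetical: scale_row_fn, shrink_row_c, shrink_row_mmx, cpu_has_mmx and select_shrink_row are made up for illustration and are not taken from transform.c or from SDL. The "MMX" routine is only a stand-in that calls the C code, the pair-averaging filter is a placeholder rather than the area-averaging algorithm from the patch, and the CPU check assumes the GCC/Clang builtin __builtin_cpu_supports; a build that links SDL would presumably call SDL_HasMMX() from SDL_cpuinfo.h instead.

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none come from transform.c or SDL. */
typedef void (*scale_row_fn)(const uint8_t *src, uint8_t *dst, size_t dst_len);

/* Portable C fallback: halve a row by averaging adjacent sample pairs
   (rounded to nearest). src must hold 2 * dst_len samples. */
static void shrink_row_c(const uint8_t *src, uint8_t *dst, size_t dst_len)
{
    for (size_t i = 0; i < dst_len; i++)
        dst[i] = (uint8_t)((src[2 * i] + src[2 * i + 1] + 1) / 2);
}

/* Compile-time check: only build the x86 path where it can compile. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define HAVE_MMX_BUILD 1

/* Stand-in for an inline-asm MMX routine. It reuses the C code so the
   sketch stays runnable; a real build would put the asm here. */
static void shrink_row_mmx(const uint8_t *src, uint8_t *dst, size_t dst_len)
{
    shrink_row_c(src, dst, dst_len);
}

/* Run-time check, assuming the GCC/Clang CPU-feature builtins.
   With SDL available, SDL_HasMMX() would be the natural replacement. */
static int cpu_has_mmx(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("mmx") != 0;
}
#else
#define HAVE_MMX_BUILD 0
#endif

/* Pick the routine once; callers keep the returned pointer. */
scale_row_fn select_shrink_row(void)
{
#if HAVE_MMX_BUILD
    if (cpu_has_mmx())
        return shrink_row_mmx;
#endif
    return shrink_row_c;
}
```

A caller would run `scale_row_fn fn = select_shrink_row();` once at start-up and then call `fn(src, dst, dst_len)` for each row. On a non-x86 build the MMX branch is never compiled, so the selector can only return the C routine, which matches Richard's point that the compile-time checks have to stay in either way.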