Re: [PATCH] Re: [pygame] Smooth Scaling (pygame.transform.smoothscale)

2007-06-21 Thread Laura Creighton
In a message of Thu, 21 Jun 2007 09:18:41 +0200, [EMAIL PROTECTED] writes:
>Laura Creighton <[EMAIL PROTECTED]>:
>
>>
>> If you are on a BSD or linux system you will have a file
>> /proc/cpuinfo which can tell you what sort of CPU you have and
>
>FreeBSD does not use /proc anymore and has it disabled by default.
>
>> whether it has MMX support or not.  I don't know where Windows
>> keeps such information.  (And really old unix-and-unix-like
>> systems don't have this file, but they don't have MMX either,
>> so you are all set.)
>
>For !Win32 systems we could rely on the -march settings of the processor.
>According to the manual, the GCC specifies MMX and SSE/SSE2/SSE3 for
>several archs, so we just have to test on them. For Win32 I just know about
>the CPUID hacks, but that's only interesting for runtime checks.
>
>Regards
>Marcus

Aha, I did not know that.  Thank you. According to:
http://lists.freebsd.org/pipermail/freebsd-questions/2006-March/117134.html

sysctl -a 

will give you the same information on FreeBSD systems.  Just in case
anybody cares, because setting -march looks to accomplish the same
thing and is a lot easier.

Laura
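
A minimal sketch of the /proc/cpuinfo check described above, assuming the
Linux x86 layout in which a "flags" line lists CPU features as
space-separated tokens.  The function names are hypothetical, and other
platforms (FreeBSD, Windows, non-x86 Linux) would need a different source
of information.

```python
def cpuinfo_has_flag(cpuinfo_text, flag="mmx"):
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists `flag`."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only the "flags" key is inspected; this is the Linux x86 layout.
        if sep and key.strip() == "flags":
            if flag in value.split():
                return True
    return False


def host_has_mmx(path="/proc/cpuinfo"):
    """Best-effort check; returns None when the file cannot be read."""
    try:
        with open(path) as handle:
            return cpuinfo_has_flag(handle.read(), "mmx")
    except OSError:
        return None
```

Matching whole tokens rather than substrings keeps a flag such as
"mmxext" from being mistaken for "mmx".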




Re: [PATCH] Re: [pygame] Smooth Scaling (pygame.transform.smoothscale)

2007-06-21 Thread René Dudfield

Your idea is a good one.  Otherwise we could write a minimal test
program that contains some MMX, SSE, etc. code, and try to compile it
as part of the config step.  If the code compiles OK, then it is
supported.
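
A rough sketch of such a config-step probe, assuming a gcc-compatible
compiler on the PATH.  The probe uses an MMX intrinsic from <mmintrin.h>
as a stand-in for the inline asm in the patch, and the names MMX_PROBE and
compiler_accepts are made up for illustration.

```python
import os
import shutil
import subprocess
import tempfile

# Tiny C program using MMX intrinsics; it only has to compile, not run.
MMX_PROBE = """
#include <mmintrin.h>
int main(void) {
    __m64 zero = _mm_setzero_si64();
    (void)zero;
    _mm_empty();
    return 0;
}
"""


def compiler_accepts(source, compiler="gcc", extra_flags=("-mmmx",)):
    """Return True if `compiler` turns `source` into an object file."""
    if shutil.which(compiler) is None:
        return False  # no such compiler, so the feature is treated as absent
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "probe.c")
        obj = os.path.join(workdir, "probe.o")
        with open(src, "w") as handle:
            handle.write(source)
        result = subprocess.run(
            [compiler, *extra_flags, "-c", src, "-o", obj],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0
```

A config script could call compiler_accepts(MMX_PROBE) once and only add
the MMX define to CFLAGS when it returns True.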

Note that I use gcc (mingw) for compiling pygame on Windows, and I
think the asm code will mostly be gcc specific (unless someone wants
to port it).  So when people use MSVC, the code will be #ifdef'd
out with some checks for the gcc compiler.

SDL uses cpuid stuff for run time detection, so it makes sense to use that code.

We should also look to see how SDL does compilation for osx, which can
optionally support mmx for when it runs on intel processors.


On 6/21/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

Laura Creighton <[EMAIL PROTECTED]>:

>
> If you are on a BSD or linux system you will have a file
> /proc/cpuinfo which can tell you what sort of CPU you have and

FreeBSD does not use /proc anymore and has it disabled by default.

> whether it has MMX support or not.  I don't know where Windows
> keeps such information.  (And really old unix-and-unix-like
> systems don't have this file, but they don't have MMX either,
> so you are all set.)

For !Win32 systems we could rely on the -march settings of the processor.
According to the manual, the GCC specifies MMX and SSE/SSE2/SSE3 for
several archs, so we just have to test on them. For Win32 I just know about
the CPUID hacks, but that's only interesting for runtime checks.

Regards
Marcus




Re: [PATCH] Re: [pygame] Smooth Scaling (pygame.transform.smoothscale)

2007-06-21 Thread mva

Laura Creighton <[EMAIL PROTECTED]>:

> If you are on a BSD or linux system you will have a file
> /proc/cpuinfo which can tell you what sort of CPU you have and

FreeBSD does not use /proc anymore and has it disabled by default.

> whether it has MMX support or not.  I don't know where Windows
> keeps such information.  (And really old unix-and-unix-like
> systems don't have this file, but they don't have MMX either,
> so you are all set.)

For !Win32 systems we could rely on the -march settings of the processor.
According to the manual, the GCC specifies MMX and SSE/SSE2/SSE3 for
several archs, so we just have to test on them. For Win32 I just know about
the CPUID hacks, but that's only interesting for runtime checks.

Regards
Marcus
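
One way the -march idea could be tested is sketched below.  GCC predefines
the macro __MMX__ when MMX code generation is enabled (directly or through
-march), and "gcc -dM -E" prints the predefined macros.  The helper names
are hypothetical, and the sketch assumes a gcc-compatible driver.

```python
import shutil
import subprocess


def defines_macro(predefined_macros, name="__MMX__"):
    """Return True if `#define name ...` appears in `gcc -dM -E` style output."""
    for line in predefined_macros.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "#define" and parts[1] == name:
            return True
    return False


def gcc_predefines(name="__MMX__", march=None, compiler="gcc"):
    """Ask the compiler for its predefined macros; None if it cannot be run."""
    if shutil.which(compiler) is None:
        return None
    command = [compiler, "-dM", "-E", "-x", "c", "-"]
    if march is not None:
        command.insert(1, "-march=" + march)
    # An empty translation unit on stdin is enough to list the macros.
    result = subprocess.run(command, input="", capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return defines_macro(result.stdout, name)
```

Inside the C source the same macro can guard the MMX code path with an
#ifdef, which avoids a separate configure-time test on gcc builds.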



Re: [PATCH] Re: [pygame] Smooth Scaling (pygame.transform.smoothscale)

2007-06-20 Thread Laura Creighton

If you are on a BSD or linux system you will have a file
/proc/cpuinfo which can tell you what sort of CPU you have and
whether it has MMX support or not.  I don't know where Windows
keeps such information.  (And really old unix-and-unix-like
systems don't have this file, but they don't have MMX either,
so you are all set.)

Laura

In a message of Wed, 20 Jun 2007 19:30:07 EDT, Richard Goedeken writes:
>That's a good plan, but I haven't been able to find a good way to
>determine at compile time whether or not the compiler supports MMX.  You
>could always dump out a small test file and compile it to test for
>errors, but I was hoping to find something better like running gcc with
>a magical parameter or testing for a magical preproc definition.  Are
>you aware of any 'cleaner' ways of handling this detection?
>
>Richard
>
>
>René Dudfield wrote:
>> Hi,
>> 
>> that's good :)
>> 
>> SDL does it by compiling the mmx stuff in if the compiler supports it.
>> Then it has runtime checks to see if the cpu supports it.  SDL also
>> has a configure flag, which you can use to tell it not to even try
>> compiling mmx stuff.
>> 
>> So if the compiler doesn't support it, the C version is used.
>> If the compiler supports it, and the runtime cpu detection finds mmx,
>> then the mmx version is used.
>> 
>> 
>> 
>> On 6/20/07, Richard Goedeken <[EMAIL PROTECTED]> wrote:
>>> Rene,
>>>
>>> I would be willing to add the CPU detection functions but I can't think
>>> of how it could be implemented in a useful way.  The compile-time checks
>>> have to stay in because trying to compile the 64-bit code on a 32-bit
>>> architecture, or the 32-bit MMX code on a PPC or similar, will cause
>>> compile-time errors.  So it's given that if someone is running an i386,
>>> PPC, Sun, Arm, etc, they will get the C code.  If they're running i686,
>>> they'll get the 32-bit MMX, and for x86_64 they'll get the 64-bit MMX.
>>>
>>> So, the dilemma is that whatever build a user is running, the code is
>>> pretty much guaranteed to work on their CPU.  If someone is running the
>>> i686 build on a 486 or something silly like that they'll probably have
>>> bigger problems.  We could allow someone to 'downgrade' and run the C
>>> code when their CPU supports MMX, but what's the point?
>>>
>>> Regards,
>>> Richard


Re: [PATCH] Re: [pygame] Smooth Scaling (pygame.transform.smoothscale)

2007-06-20 Thread René Dudfield

Hi,

I guess we could use the Python platform stuff, like much of what is
done in the Python config files at the moment.  We put '-enable-mmx'
(or whatever it is called) into each platform that we know supports it:
mingw, windows, *nix with x86, osx.

Not quite as good as a gcc flag, but I think it'll work.




On 6/21/07, Richard Goedeken <[EMAIL PROTECTED]> wrote:

That's a good plan, but I haven't been able to find a good way to
determine at compile time whether or not the compiler supports MMX.  You
could always dump out a small test file and compile it to test for
errors, but I was hoping to find something better like running gcc with
a magical parameter or testing for a magical preproc definition.  Are
you aware of any 'cleaner' ways of handling this detection?

Richard


René Dudfield wrote:
> Hi,
>
> that's good :)
>
> SDL does it by compiling the mmx stuff in if the compiler supports it.
> Then it has runtime checks to see if the cpu supports it.  SDL also
> has a configure flag, which you can use to tell it not to even try
> compiling mmx stuff.
>
> So if the compiler doesn't support it, the C version is used.
> If the compiler supports it, and the runtime cpu detection finds mmx,
> then the mmx version is used.
>
>
>
> On 6/20/07, Richard Goedeken <[EMAIL PROTECTED]> wrote:
>> Rene,
>>
>> I would be willing to add the CPU detection functions but I can't think
>> of how it could be implemented in a useful way.  The compile-time checks
>> have to stay in because trying to compile the 64-bit code on a 32-bit
>> architecture, or the 32-bit MMX code on a PPC or similar, will cause
>> compile-time errors.  So it's given that if someone is running an i386,
>> PPC, Sun, Arm, etc, they will get the C code.  If they're running i686,
>> they'll get the 32-bit MMX, and for x86_64 they'll get the 64-bit MMX.
>>
>> So, the dilemma is that whatever build a user is running, the code is
>> pretty much guaranteed to work on their CPU.  If someone is running the
>> i686 build on a 486 or something silly like that they'll probably have
>> bigger problems.  We could allow someone to 'downgrade' and run the C
>> code when their CPU supports MMX, but what's the point?
>>
>> Regards,
>> Richard



Re: [PATCH] Re: [pygame] Smooth Scaling (pygame.transform.smoothscale)

2007-06-20 Thread Richard Goedeken
That's a good plan, but I haven't been able to find a good way to
determine at compile time whether or not the compiler supports MMX.  You
could always dump out a small test file and compile it to test for
errors, but I was hoping to find something better like running gcc with
a magical parameter or testing for a magical preproc definition.  Are
you aware of any 'cleaner' ways of handling this detection?

Richard


René Dudfield wrote:
> Hi,
> 
> that's good :)
> 
> SDL does it by compiling the mmx stuff in if the compiler supports it.
> Then it has runtime checks to see if the cpu supports it.  SDL also
> has a configure flag, which you can use to tell it not to even try
> compiling mmx stuff.
> 
> So if the compiler doesn't support it, the C version is used.
> If the compiler supports it, and the runtime cpu detection finds mmx,
> then the mmx version is used.
> 
> 
> 
> On 6/20/07, Richard Goedeken <[EMAIL PROTECTED]> wrote:
>> Rene,
>>
>> I would be willing to add the CPU detection functions but I can't think
>> of how it could be implemented in a useful way.  The compile-time checks
>> have to stay in because trying to compile the 64-bit code on a 32-bit
>> architecture, or the 32-bit MMX code on a PPC or similar, will cause
>> compile-time errors.  So it's given that if someone is running an i386,
>> PPC, Sun, Arm, etc, they will get the C code.  If they're running i686,
>> they'll get the 32-bit MMX, and for x86_64 they'll get the 64-bit MMX.
>>
>> So, the dilemma is that whatever build a user is running, the code is
>> pretty much guaranteed to work on their CPU.  If someone is running the
>> i686 build on a 486 or something silly like that they'll probably have
>> bigger problems.  We could allow someone to 'downgrade' and run the C
>> code when their CPU supports MMX, but what's the point?
>>
>> Regards,
>> Richard


Re: [PATCH] Re: [pygame] Smooth Scaling (pygame.transform.smoothscale)

2007-06-19 Thread René Dudfield

Hi,

that's good :)

SDL does it by compiling the mmx stuff in if the compiler supports it.
Then it has runtime checks to see if the cpu supports it.  SDL also
has a configure flag, which you can use to tell it not to even try
compiling mmx stuff.

So if the compiler doesn't support it, the C version is used.
If the compiler supports it, and the runtime cpu detection finds mmx,
then the mmx version is used.
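
The selection rule above can be written down as a small sketch.  The
function and argument names are hypothetical; in a real build the first
flag would come from the compile step and the second from a runtime check
such as SDL_HasMMX() in SDL_cpuinfo.h.

```python
def pick_scaler(compiled_with_mmx, cpu_has_mmx, force_c=False):
    """Return which code path to use: "mmx" or "c".

    compiled_with_mmx: the MMX routines were built into this binary.
    cpu_has_mmx:       runtime detection found MMX on this CPU.
    force_c:           mirrors a configure flag that disables MMX entirely.
    """
    if force_c or not compiled_with_mmx:
        return "c"
    return "mmx" if cpu_has_mmx else "c"
```

The C version is always the fallback, so a binary built with MMX support
still runs on a CPU without it.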



On 6/20/07, Richard Goedeken <[EMAIL PROTECTED]> wrote:

Rene,

I would be willing to add the CPU detection functions but I can't think
of how it could be implemented in a useful way.  The compile-time checks
have to stay in because trying to compile the 64-bit code on a 32-bit
architecture, or the 32-bit MMX code on a PPC or similar, will cause
compile-time errors.  So it's given that if someone is running an i386,
PPC, Sun, Arm, etc, they will get the C code.  If they're running i686,
they'll get the 32-bit MMX, and for x86_64 they'll get the 64-bit MMX.

So, the dilemma is that whatever build a user is running, the code is
pretty much guaranteed to work on their CPU.  If someone is running the
i686 build on a 486 or something silly like that they'll probably have
bigger problems.  We could allow someone to 'downgrade' and run the C
code when their CPU supports MMX, but what's the point?

Regards,
Richard



René Dudfield wrote:
> Nice one!
>
> This sounds like a very nice scaling function.
>
> It'd be cool if we could include a run time way of including mmx and
> other cpu specific optimizations.  Probably using the SDL methods would
> be the way to go.
>
> I've added it to the todo list for this week's mini sprint.
> http://www.pygame.org/wiki/todo  So hopefully it'll get into pygame soon.
>
> If you feel like figuring out how to use the SDL mmx detection routines
> to select the mmx routine at runtime, that'd be cool.
>
>
> On 6/18/07, *Richard Goedeken* <[EMAIL PROTECTED]> wrote:
>
> Hello everyone.  I just joined the list; My name is Richard Goedeken.
> I'm using Pygame in a project that I've been working on for a few weeks,
> and I wanted an image scaling function with higher visual quality than
> the nearest-neighbor algorithm which is included with the 'scale'
> function.  So I wrote one; it's in the attached zip file. I hereby give
> the Pygame maintainers permission to include and distribute this code
> with the Pygame project under the license of their choice.
>
> The algorithm which I've implemented is interesting.  Each axis is
> scaled independently, which gives it the property that scaling an image
> only in the X dimension or only in the Y dimension will be about twice
> as fast as scaling both.  The reason that this design was chosen is
> because the axes are scaled differently depending upon whether they are
> being shrunk or expanded.  For expansion, a bilinear filter is used
> which looks nice at magnifications under 3x or so and is quick.  For
> shrinking the image, a novel area-averaging algorithm is used which
> suppresses Moire patterns and looks good even at very small sizes.
>
> The source code is in transform.c.  It's pretty big because I've also
> included inline MMX routines for the i686 and x86_64 architectures under
> Unix.  The AT&T-style asm syntax won't work with the Intel or MS
> compilers, but someone could translate it and add Intel-style code for
> Win32.  It runs a lot faster with the MMX code.  I have included a test
> program (scaletest.py) which can run a short benchmark series of scaling
> operations.  When run with a 600k pixel image, I got the following
> results:
>
> Machine          Algorithm     Code level    Shrink time   Expand time
> Athlon64 3800+   smoothscale   C-only        36 ms         96 ms
> Athlon64 3800+   smoothscale   64-bit MMX    5 ms          16 ms
> Athlon64 3800+   scale         C-only        2 ms          13 ms
> Pentium 3-800    smoothscale   C-only        64 ms         180 ms
> Pentium 3-800    smoothscale   32-bit MMX    39 ms         119 ms
> Pentium 3-800    scale         C-only        17 ms         85 ms
>
> I was surprised that the MMX ran so much (6x) faster than the C-code on
> my 64-bit machine.  But I'm happy that it actually comes close to
> matching the nearest-neighbor 'scale' function.  I think the P-3 may
> have been hindered by relatively low memory bandwidth.  With newer
> 32-bit architectures such as the Core 2 or Athlon I believe that the MMX
> will give a bigger speed gain over the C than the P-3.
>
> The 'config.py' file is also modified to set CFLAGS to activate the
> inline assembly code.  I've integrated this new function into my project
> system, and it's quite a nice visual upgrade.  I'm sure there are a lot
> of people who could use a relatively fast smooth scaling algorithm in
> the pygame software, so enjoy!
>
> Richard
>
>
>
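
The per-axis design in the quoted announcement can be illustrated with a
small pure-Python sketch.  This is not the code from transform.c: the
shrink step here is a plain area average standing in for the "novel"
algorithm, whose details are not given in the thread.  It works on
grayscale rows of numbers, and all function names are made up.

```python
def shrink_row(row, new_len):
    """Area-average `row` down to `new_len` samples (new_len <= len(row))."""
    scale = len(row) / new_len  # source samples covered by one output sample
    out = []
    for i in range(new_len):
        start, end = i * scale, (i + 1) * scale
        total = 0.0
        j = int(start)
        while j < end and j < len(row):
            # Overlap of source sample [j, j+1) with the window [start, end).
            overlap = min(end, j + 1) - max(start, j)
            total += row[j] * overlap
            j += 1
        out.append(total / scale)
    return out


def expand_row(row, new_len):
    """Linearly interpolate `row` up to `new_len` samples (new_len >= len(row))."""
    if len(row) == 1:
        return [float(row[0])] * new_len
    step = (len(row) - 1) / (new_len - 1)  # endpoints map onto endpoints
    out = []
    for i in range(new_len):
        pos = i * step
        left = min(int(pos), len(row) - 2)
        frac = pos - left
        out.append(row[left] * (1.0 - frac) + row[left + 1] * frac)
    return out


def resize_row(row, new_len):
    """Pick the filter per axis: average when shrinking, interpolate otherwise."""
    if new_len < len(row):
        return shrink_row(row, new_len)
    return expand_row(row, new_len)


def smoothscale_sketch(image, new_width, new_height):
    """Scale a grayscale image (list of rows) one axis at a time."""
    rows = [resize_row(r, new_width) for r in image]              # X pass
    cols = [resize_row(list(c), new_height) for c in zip(*rows)]  # Y pass
    return [list(r) for r in zip(*cols)]
```

Because each axis is handled by its own pass, an image can be shrunk in X
and expanded in Y in the same call, each with the filter suited to it.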

Re: [PATCH] Re: [pygame] Smooth Scaling (pygame.transform.smoothscale)

2007-06-19 Thread Richard Goedeken
Rene,

I would be willing to add the CPU detection functions but I can't think
of how it could be implemented in a useful way.  The compile-time checks
have to stay in because trying to compile the 64-bit code on a 32-bit
architecture, or the 32-bit MMX code on a PPC or similar, will cause
compile-time errors.  So it's given that if someone is running an i386,
PPC, Sun, Arm, etc, they will get the C code.  If they're running i686,
they'll get the 32-bit MMX, and for x86_64 they'll get the 64-bit MMX.

So, the dilemma is that whatever build a user is running, the code is
pretty much guaranteed to work on their CPU.  If someone is running the
i686 build on a 486 or something silly like that they'll probably have
bigger problems.  We could allow someone to 'downgrade' and run the C
code when their CPU supports MMX, but what's the point?

Regards,
Richard
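
The build-to-code-path mapping described above, as a minimal sketch.  The
return labels and the function name are hypothetical, and treating "amd64"
as an alias for x86_64 is an added assumption that is not in the thread.

```python
import platform


def code_path_for(machine):
    """Map a machine string to the code path the build would contain."""
    name = machine.lower()
    if name in ("x86_64", "amd64"):  # "amd64" alias is an assumption
        return "mmx64"
    if name == "i686":
        return "mmx32"
    # i386, PPC, Sun, Arm and anything unrecognized get the portable C code.
    return "c"


if __name__ == "__main__":
    print(platform.machine(), "->", code_path_for(platform.machine()))
```

This only mirrors the compile-time choice; it says nothing about whether
the CPU the binary later runs on matches the build.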



René Dudfield wrote:
> Nice one!
> 
> This sounds like a very nice scaling function.
> 
> It'd be cool if we could include a run time way of including mmx and
> other cpu specific optimizations.  Probably using the SDL methods would
> be the way to go.
> 
> I've added it to the todo list for this week's mini sprint.
> http://www.pygame.org/wiki/todo  So hopefully it'll get into pygame soon.
> 
> If you feel like figuring out how to use the SDL mmx detection routines
> to select the mmx routine at runtime, that'd be cool.
> 
> 
> On 6/18/07, *Richard Goedeken* <[EMAIL PROTECTED]> wrote:
> 
> Hello everyone.  I just joined the list; My name is Richard Goedeken.
> I'm using Pygame in a project that I've been working on for a few weeks,
> and I wanted an image scaling function with higher visual quality than
> the nearest-neighbor algorithm which is included with the 'scale'
> function.  So I wrote one; it's in the attached zip file. I hereby give
> the Pygame maintainers permission to include and distribute this code
> with the Pygame project under the license of their choice.
> 
> The algorithm which I've implemented is interesting.  Each axis is
> scaled independently, which gives it the property that scaling an image
> only in the X dimension or only in the Y dimension will be about twice
> as fast as scaling both.  The reason that this design was chosen is
> because the axes are scaled differently depending upon whether they are
> being shrunk or expanded.  For expansion, a bilinear filter is used
> which looks nice at magnifications under 3x or so and is quick.  For
> shrinking the image, a novel area-averaging algorithm is used which
> suppresses Moire patterns and looks good even at very small sizes.
> 
> The source code is in transform.c.  It's pretty big because I've also
> included inline MMX routines for the i686 and x86_64 architectures under
> Unix.  The AT&T-style asm syntax won't work with the Intel or MS
> compilers, but someone could translate it and add Intel-style code for
> Win32.  It runs a lot faster with the MMX code.  I have included a test
> program (scaletest.py) which can run a short benchmark series of scaling
> operations.  When run with a 600k pixel image, I got the following
> results:
> 
> Machine          Algorithm     Code level    Shrink time   Expand time
> Athlon64 3800+   smoothscale   C-only        36 ms         96 ms
> Athlon64 3800+   smoothscale   64-bit MMX    5 ms          16 ms
> Athlon64 3800+   scale         C-only        2 ms          13 ms
> Pentium 3-800    smoothscale   C-only        64 ms         180 ms
> Pentium 3-800    smoothscale   32-bit MMX    39 ms         119 ms
> Pentium 3-800    scale         C-only        17 ms         85 ms
> 
> I was surprised that the MMX ran so much (6x) faster than the C-code on
> my 64-bit machine.  But I'm happy that it actually comes close to
> matching the nearest-neighbor 'scale' function.  I think the P-3 may
> have been hindered by relatively low memory bandwidth.  With newer
> 32-bit architectures such as the Core 2 or Athlon I believe that the MMX
> will give a bigger speed gain over the C than the P-3.
> 
> The 'config.py' file is also modified to set CFLAGS to activate the
> inline assembly code.  I've integrated this new function into my project
> system, and it's quite a nice visual upgrade.  I'm sure there are a lot
> of people who could use a relatively fast smooth scaling algorithm in
> the pygame software, so enjoy!
> 
> Richard
> 
> 
> 


Re: [PATCH] Re: [pygame] Smooth Scaling (pygame.transform.smoothscale)

2007-06-17 Thread René Dudfield

More information on the SDL cpu detection stuff.

See the code in the sdl source:
include/SDL_cpuinfo.h
src/video/blit.c
src/video/mmx.h



On 6/18/07, René Dudfield <[EMAIL PROTECTED]> wrote:

Nice one!

This sounds like a very nice scaling function.

It'd be cool if we could include a run time way of including mmx and other cpu 
specific optimizations.  Probably using the SDL methods would be the way to go.

I've added it to the todo list for this week's mini sprint.
http://www.pygame.org/wiki/todo  So hopefully it'll get into pygame soon.

If you feel like figuring out how to use the SDL mmx detection routines to 
select the mmx routine at runtime, that'd be cool.




On 6/18/07, Richard Goedeken <[EMAIL PROTECTED]> wrote:
>  Hello everyone.  I just joined the list; My name is Richard Goedeken.
> I'm using Pygame in a project that I've been working on for a few weeks,
> and I wanted an image scaling function with higher visual quality than
> the nearest-neighbor algorithm which is included with the 'scale'
> function.  So I wrote one; it's in the attached zip file. I hereby give
> the Pygame maintainers permission to include and distribute this code
> with the Pygame project under the license of their choice.
>
> The algorithm which I've implemented is interesting.  Each axis is
> scaled independently, which gives it the property that scaling an image
> only in the X dimension or only in the Y dimension will be about twice
> as fast as scaling both.  The reason that this design was chosen is
> because the axes are scaled differently depending upon whether they are
> being shrunk or expanded.  For expansion, a bilinear filter is used
>  which looks nice at magnifications under 3x or so and is quick.  For
> shrinking the image, a novel area-averaging algorithm is used which
> suppresses Moire patterns and looks good even at very small sizes.
>
> The source code is in  transform.c.  It's pretty big because I've also
> included inline MMX routines for the i686 and x86_64 architectures under
> Unix.  The AT&T-style asm syntax won't work with the Intel or MS
> compilers, but someone could translate it and add Intel-style code for
> Win32.  It runs a lot faster with the MMX code.  I have included a test
> program (scaletest.py) which can run a short benchmark series of scaling
> operations.  When run with a 600k pixel image, I got the following results:
>
> Machine          Algorithm     Code level    Shrink time   Expand time
> Athlon64 3800+   smoothscale   C-only        36 ms         96 ms
> Athlon64 3800+   smoothscale   64-bit MMX    5 ms          16 ms
> Athlon64 3800+   scale         C-only        2 ms          13 ms
> Pentium 3-800    smoothscale   C-only        64 ms         180 ms
> Pentium 3-800    smoothscale   32-bit MMX    39 ms         119 ms
> Pentium 3-800    scale         C-only        17 ms         85 ms
>
> I was surprised that the MMX ran so much (6x) faster than the C-code on
> my 64-bit machine.  But I'm happy that it actually comes close to
> matching the nearest-neighbor 'scale' function.  I think the P-3 may
> have been hindered by relatively low memory bandwidth.  With newer
> 32-bit architectures such as the Core 2 or Athlon I believe that the MMX
> will give a bigger speed gain over the C than the P-3.
>
> The 'config.py' file is also modified to set CFLAGS to activate the
>  inline assembly code.  I've integrated this new function into my project
> system, and it's quite a nice visual upgrade.  I'm sure there are a lot
> of people who could use a relatively fast smooth scaling algorithm in
> the pygame software, so enjoy!
>
> Richard
>
>
>