Re: [fpc-devel] SSE in FPC
Alain Michaud wrote:
> I think that high precision numerical computing is very useful in many fields of science and technology:
> [...]
> Therefore, I think that 64 bits is a limit!

You are right, but for that reason scientific math libraries are always based on large-integer calculations. SSE, MMX etc. are designed for multimedia applications, not science. However, there are dedicated units available that implement large integers in hardware (really FPGAs, I believe, but I am not sure) to speed up scientific processing (at a very high price, but at least up to 1024 bits that I know of).

___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
Re: [fpc-devel] SSE in FPC
Vinzent Höfler wrote:
> If you need precision you have to do integer math anyway. Floating point numbers are always crude approximations, no matter how many bits they have.
> [...]
> How much of the universe do you expect to fit into 80 bits?

I think that high-precision numerical computing is very useful in many fields of science and technology:

The Rydberg constant is defined using 13 digits.
The second is measured using 15 digits.
Some laser spectroscopists make measurements using 18 digits.
The Lamb shift in hydrogen has been measured using 13 digits.

Right now this measurement is more accurate than the theoretical evaluation: in order to evaluate this number (to 12 digits), the researchers had to make a long calculation involving the sum of about 2000 numerical integrals. You are right, they used symbolic computation. However, to make sure that they did not miss something, they also checked every integral by performing a numerical integration as a test!

Therefore, I think that 64 bits is a limit!

Alain
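To put the digit counts above next to the two formats under discussion, here is a small sketch that converts a significand width in bits into decimal digits. It assumes the usual IEEE-754 widths (53 bits for double, 64 bits for the x87 extended format); the program and function names are made up for illustration.

```pascal
program digits;

{ Decimal digits carried by a binary significand of the given width:
  bits * log10(2). }
function DecimalDigits(SignificandBits: Integer): Double;
begin
  DecimalDigits := SignificandBits * Ln(2.0) / Ln(10.0);
end;

begin
  { double: 52 stored bits plus one implicit bit }
  WriteLn('double   (53-bit significand): ', DecimalDigits(53):0:2, ' digits');
  { x87 extended: 64 explicit bits }
  WriteLn('extended (64-bit significand): ', DecimalDigits(64):0:2, ' digits');
end.
```

This prints roughly 15.95 digits for double and 19.27 for extended, so an 18-digit measurement fits in the extended format but not in a double.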
Re: [fpc-devel] SSE in FPC
Alain Michaud wrote:
> It turns out that only the "old" FP87 (FPU) has the 80 bit (extended) floating format! Anything else, and you are limited to the 64 bit (double) floating numbers. I feel sorry to see that even 25 years after the original 8087, modern CPUs are not even capable of the same precision!

If you need precision you have to do integer math anyway. Floating point numbers are always crude approximations, no matter how many bits they have.

Considering that the numerical range of an 80-bit floating point number is larger than the ratio between the diameter of an atom and the diameter of the known universe, this so-called "precision" is hardly ever needed.

Or to express this a bit (pun intended) differently: how much of the universe do you expect to fit into 80 bits?

Vinzent.
Re: [fpc-devel] SSE in FPC
Hi,

Not exactly related to this thread, but worth mentioning: some time ago I was interested in numerically computing some Bessel functions to the highest precision! I looked at MMX, SSE, SSE2, SSE3, SSE4, FP87, 3D-something etc... And all that jazz...

It turns out that only the "old" FP87 (FPU) has the 80-bit (extended) floating-point format! With anything else, you are limited to 64-bit (double) floating-point numbers.

I feel sorry to see that even 25 years after the original 8087, modern CPUs are not even capable of the same precision!

Alain
Re: [fpc-devel] SSE in FPC
Florian Klaempfl wrote:
> Dariusz Mazur wrote:
>> Ok. I only want to discover how to use it. If I write
>>
>>   type
>>     tSSEVector = array[0..1] of double;
>>   var
>>     d1 : tSSEVector;
>>     d2 : array[0..1] of tSSEVector;
>>   begin
>>     d1 := d2[0]*d2[1];
>>   end;
>>
>> will that work too?
>
> Yes.

Thanks. Now I understand.

>> Or something like this:
>>
>>   procedure f;
>>   var
>>     d1,d2,d3 : array[0..4] of integer;
>>   begin
>>     d1 := d2*d3;
>>   end;
>
> No. Not yet implemented iirc.

But will it be implemented in the same manner?

-- 
Darek
Re: [fpc-devel] SSE in FPC
Dariusz Mazur wrote:
> Florian Klaempfl wrote:
>> Of course one knows. For array operations as shown SSE/SSE2 is used if enabled.
>
> Ok. I only want to discover how to use it. If I write
>
>   type
>     tSSEVector = array[0..1] of double;
>   var
>     d1 : tSSEVector;
>     d2 : array[0..1] of tSSEVector;
>   begin
>     d1 := d2[0]*d2[1];
>   end;
>
> will that work too?

Yes.

> Or something like this:
>
>   procedure f;
>   var
>     d1,d2,d3 : array[0..4] of integer;
>   begin
>     d1 := d2*d3;
>   end;

No. Not yet implemented iirc.
Re: [fpc-devel] SSE in FPC
Florian Klaempfl wrote:
> Dariusz Mazur wrote:
>> But until now nobody knows where the compiler uses SSE instructions.
>
> Of course one knows. For array operations as shown SSE/SSE2 is used if enabled.

Ok. I only want to discover how to use it. If I write

  type
    tSSEVector = array[0..1] of double;
  var
    d1 : tSSEVector;
    d2 : array[0..1] of tSSEVector;
  begin
    d1 := d2[0]*d2[1];
  end;

will that work too?

Or something like this:

  procedure f;
  var
    d1,d2,d3 : array[0..4] of integer;
  begin
    d1 := d2*d3;
  end;

-- 
Darek
Re: [fpc-devel] SSE in FPC
Dariusz Mazur wrote:
> Jonas Maebe wrote:
>> Why can't you now? It's not like multiplication has any other meaning for arrays. And declaring "magic compiler types" in the system unit is something that should be avoided as much as possible (it makes both the compiler and rtl harder to adapt and understand).
>
> Of course, but SIMD is something that has more and more impact on performance, and the compiler should respect it (and it does, as Florian said). But until now nobody knows where the compiler uses SSE instructions.

Of course one knows. For array operations as shown SSE/SSE2 is used if enabled.
Re: [fpc-devel] SSE in FPC
Jonas Maebe wrote:
>> Of course tSSEVector should be declared in the System unit. Then anyone can use SSE intentionally.
>
> Why can't you now? It's not like multiplication has any other meaning for arrays. And declaring "magic compiler types" in the system unit is something that should be avoided as much as possible (it makes both the compiler and rtl harder to adapt and understand).

Of course, but SIMD is something that has more and more impact on performance, and the compiler should respect it (and it does, as Florian said). But until now nobody knows where the compiler uses SSE instructions.

In my mind, the way to achieve this is to declare some primitives that represent SSE types. On the one hand these are "magic compiler types", but on the other hand they exist in the real world (most CPUs in use have them). And it would be harmless, because it is not a change to the language (we can use a record or an array) and it is easy to implement the needed primitives in pure Pascal. It is very similar to IEEE-754: on platforms where no FPU exists, all functions are implemented by hand.

-- 
Darek
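As a rough illustration of the "pure Pascal" fallback mentioned above, the element-wise product that d1 := d2*d3 is meant to compute can be written as an ordinary function. This is only a sketch: the name VecMul is hypothetical and not part of the RTL, and nothing here makes the compiler emit SSE instructions.

```pascal
program vecfallback;
{$mode objfpc}

type
  TSSEVector = array[0..1] of Double;

{ Hypothetical helper: element-wise product of two vectors,
  written as a plain loop with no vector instructions. }
function VecMul(const a, b: TSSEVector): TSSEVector;
var
  i: Integer;
begin
  for i := Low(a) to High(a) do
    Result[i] := a[i] * b[i];
end;

var
  d1, d2, d3: TSSEVector;
begin
  d2[0] := 1.5; d2[1] := 2.0;
  d3[0] := 4.0; d3[1] := 0.5;
  d1 := VecMul(d2, d3);
  WriteLn(d1[0]:0:2, ' ', d1[1]:0:2);  { prints 6.00 1.00 }
end.
```

A target without vector hardware could keep such a loop, while a vectorising compiler would be free to replace it with a single packed multiply.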
Re: [fpc-devel] SSE in FPC
Mattias Gaertner wrote:
> Use the mmx unit to discover what mmx/sse instruction set is supported by your cpu. You can use mmx/sse commands directly in the asm blocks and FPC automatically uses MMX/SSE instructions for your pascal code if you allow it (specify CPU type).
> In my experiments FPC often created faster code itself. So don't expect much speed gain when using SSE instructions directly.

MMX can operate on 2 longints simultaneously and SSE on 4. FPC sometimes optimizes to SSE itself, and the code is then faster.

I don't want to write SSE instructions by hand, but it would be nice to be able to tell the compiler where those optimizations should be made (if that is possible).

Darek
Re: [fpc-devel] SSE in FPC
On 29 Nov 2008, at 18:15, Dariusz Mazur wrote:
> Florian Klaempfl wrote:
>>   function f : double;
>>   var
>>     d1,d2,d3 : array[0..1] of double;
>>   begin
>>     d1:=d2*d3;
>>   end;
>
> I would expect something like this:
>
>   type
>     tSSEVector = packed record
>       a,b : double;
>     end;
>     { or }
>     tSSEVector = array[0..1] of double;
>
>   function f : double;
>   var
>     d1,d2,d3 : tSSEVector;
>   begin
>     d1:=d2*d3;
>   end;
>
> Of course tSSEVector should be declared in the System unit. Then anyone can use SSE intentionally.

Why can't you now? It's not like multiplication has any other meaning for arrays. And declaring "magic compiler types" in the system unit is something that should be avoided as much as possible (it makes both the compiler and rtl harder to adapt and understand).

Jonas
Re: [fpc-devel] SSE in FPC
Florian Klaempfl wrote:
> With -Sv -Cfsse2 you can compile things like

I know this, but it is hard to discover where and when it is used. Often it is good when the compiler makes the optimization itself, but sometimes it is better to tell it about possible vectorization.

>   function f : double;
>   var
>     d1,d2,d3 : array[0..1] of double;
>   begin
>     d1:=d2*d3;
>   end;

I would expect something like this:

  type
    tSSEVector = packed record
      a,b : double;
    end;
    { or }
    tSSEVector = array[0..1] of double;

  function f : double;
  var
    d1,d2,d3 : tSSEVector;
  begin
    d1:=d2*d3;
  end;

Of course tSSEVector should be declared in the System unit. Then anyone can use SSE intentionally.

> However, it's not well tested.

Is there a list of the SSE functions that FPC can use?

Darek
Re: [fpc-devel] SSE in FPC
Jonas Maebe wrote:
> On 29 Nov 2008, at 11:11, Felipe Monteiro de Carvalho wrote:
>> You can tell FPC to do the SSE code for you:
>>
>>   -Cf  Select fpu instruction set to use, see fpc -i for possible values
>
> That only applies to (scalar) FPU operations at this time. It won't do any (auto or other) vectorisation.

With -Sv -Cfsse2 you can compile things like

  function f : double;
  var
    d1,d2,d3 : array[0..1] of double;
  begin
    d1:=d2*d3;
  end;

However, it's not well tested.
Re: [fpc-devel] SSE in FPC
On 29 Nov 2008, at 11:11, Felipe Monteiro de Carvalho wrote:
> You can tell FPC to do the SSE code for you:
>
>   -Cf  Select fpu instruction set to use, see fpc -i for possible values

That only applies to (scalar) FPU operations at this time. It won't do any (auto or other) vectorisation.

Jonas
Re: [fpc-devel] SSE in FPC
You can tell FPC to do the SSE code for you:

  -Cf  Select fpu instruction set to use, see fpc -i for possible values

And in fpc -i I see:

  Supported CPU instruction sets:
    386 PENTIUM PENTIUM2 PENTIUM3 PENTIUM4 PENTIUMM

  Supported FPU instruction sets:
    X87 SSE SSE2 SSE3

So I would use something like -CfSSE3, and maybe also -CpPENTIUM4 (PENTIUM4 is a CPU instruction set, so it goes with -Cp rather than -Cf).

-- 
Felipe Monteiro de Carvalho
Re: [fpc-devel] SSE in FPC
On Fri, 28 Nov 2008 21:25:16 +0100 darekm wrote:
> Hi
>
> Does FPC have built-in support for the streaming SIMD (SSE) instruction set, like GCC:
>
> http://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/X86-Built_002din-Functions.html
>
> or Microsoft Visual Studio:
>
> http://msdn.microsoft.com/en-us/library/kcwz153a(VS.80).aspx

Use the mmx unit to discover which MMX/SSE instruction sets are supported by your CPU.

You can use MMX/SSE instructions directly in asm blocks, and FPC automatically uses MMX/SSE instructions for your Pascal code if you allow it (specify the CPU type).

In my experiments FPC often created faster code itself, so don't expect much speed gain from using SSE instructions directly.

Mattias
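A minimal sketch of the run-time check suggested above. It assumes an i386 target and that the mmx unit exposes boolean flags named is_mmx_cpu, is_sse_cpu and is_sse2_cpu; those names are recalled from the i386 RTL and should be checked against the mmx unit shipped with your FPC version. The program name is made up.

```pascal
program cpucheck;

{$ifdef CPUI386}
uses
  mmx;  { assumed to exist only for the i386 target }
{$endif}

begin
{$ifdef CPUI386}
  { Flags assumed to be filled in by the mmx unit at startup }
  WriteLn('MMX  supported: ', is_mmx_cpu);
  WriteLn('SSE  supported: ', is_sse_cpu);
  WriteLn('SSE2 supported: ', is_sse2_cpu);
{$else}
  WriteLn('The mmx unit is not used on this target.');
{$endif}
end.
```

The conditional compilation only keeps the sketch buildable on other targets; on i386 the three lines report what the CPU offers, which is separate from what the compiler is allowed to emit via -Cf and -Cp.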
[fpc-devel] SSE in FPC
Hi

Does FPC have built-in support for the streaming SIMD (SSE) instruction set, like GCC:

http://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/X86-Built_002din-Functions.html

or Microsoft Visual Studio:

http://msdn.microsoft.com/en-us/library/kcwz153a(VS.80).aspx

-- 
Darek