Re: [fpc-devel] Extended type

2011-04-19 Thread Alexander Klenin
2011/4/19 Nikolai Zhubr n-a-zh...@yandex.ru:
 Now, with the
 introduction of 64-bit processors IIRC AMD took care of this problem by
 providing some means to execute floating-point operations without the need
 for traditional FPU register space, thus allowing to avoid the need to
 save/restore FPU state. IIRC these are some _new_ opcodes, unavailable on
 earlier CPUs.

Very interesting -- can you provide further detail on this?
I could not find anything relevant in either vol.1 ch.6 or vol.5 ch.2 of
AMD's APM -- is there something I overlooked?


-- 
Alexander S. Klenin
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Extended type

2011-04-19 Thread Jonas Maebe


On 19 Apr 2011, at 11:43, Alexander Klenin wrote:


2011/4/19 Nikolai Zhubr n-a-zh...@yandex.ru:

Now, with the
introduction of 64-bit processors IIRC AMD took care of this problem by
providing some means to execute floating-point operations without the need
for traditional FPU register space, thus allowing to avoid the need to
save/restore FPU state. IIRC these are some _new_ opcodes, unavailable on
earlier CPUs.


Very interesting -- can you provide further detail on this?
I could not find anything relevant in either vol.1 ch.6 or vol.5 ch.2 of
AMD's APM -- is there something I overlooked?


There are no really new instructions for floating point. However,  
x86-64 mandates at least SSE2 (while x86 does not), which in turn  
supports 64 bit floating point math.
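
To put rough numbers on the difference between the 64-bit doubles that SSE2 handles and the 80-bit x87 extended format, here is a small Python sketch. The helper names are made up for illustration, and the 64-bit significand width of the extended format is hard-coded, since Python floats are IEEE doubles and cannot hold extended values directly.

```python
import math
import sys

# Significand widths in bits, including the leading integer bit.
DOUBLE_SIGNIFICAND_BITS = sys.float_info.mant_dig  # 53 on IEEE 754 platforms
EXTENDED_SIGNIFICAND_BITS = 64  # x87 80-bit extended format (assumed, not measured)


def decimal_digits(significand_bits):
    """Approximate number of decimal digits a significand of this width carries."""
    return significand_bits * math.log10(2)


def epsilon(significand_bits):
    """Gap between 1.0 and the next representable value for this width."""
    return 2.0 ** (1 - significand_bits)


for name, bits in (("double", DOUBLE_SIGNIFICAND_BITS),
                   ("extended", EXTENDED_SIGNIFICAND_BITS)):
    print(f"{name:8s} {bits} bits, ~{decimal_digits(bits):.2f} digits, "
          f"epsilon {epsilon(bits):.3e}")
```

On an IEEE 754 platform this works out to about 15.95 decimal digits for double and about 19.27 for extended, so the x87 format buys roughly three extra digits.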



Jonas


Re: [fpc-devel] Extended type

2011-04-19 Thread Nikolai Zhubr

19.04.2011 13:43, Alexander Klenin:

2011/4/19 Nikolai Zhubr n-a-zh...@yandex.ru:

Now, with the
introduction of 64-bit processors IIRC AMD took care of this problem by
providing some means to execute floating-point operations without the need
for traditional FPU register space, thus allowing to avoid the need to
save/restore FPU state. IIRC these are some _new_ opcodes, unavailable on
earlier CPUs.


Very interesting -- can you provide further detail on this?
I could not find anything relevant in either vol.1 ch.6 or vol.5 ch.2 of
AMD's APM -- is there something I overlooked?


Sorry, I looked into it several years ago; I don't have any links at
hand anymore.
However, Jonas seems to be more precise on this. I think he is right and
AMD just pushed deprecation of x87 in favour of SSE(2).


Nikolai







Re: [fpc-devel] Extended type

2011-04-19 Thread Florian Klämpfl
On 19.04.2011 12:12, Daniël Mantione wrote:
 
 
 On Tue, 19 Apr 2011, Nikolai Zhubr wrote:
 
 ms (supposedly) decided to just not preserve FPU/MMX state between
 64-bit processes.
 
 MS does preserve FPU states between processes. You can use the x87 on
 Windows, nothing prevents you from doing so. Maybe the calling
 convention, but even that you can extend with x87.

FPC still uses the x87 FPU for trig. functions on Win64.

 
 It's just that the documentation tells you not to use the x87.

Yes, because its strange programming model really should be dropped.


Re: [fpc-devel] Extended type

2011-04-19 Thread Nikolai Zhubr

19.04.2011 14:12, Daniël Mantione:


MS does preserve FPU states between processes. You can use the x87 on
Windows, nothing prevents you from doing so. Maybe the calling


Yes it does for 32-bit processes on win64, guaranteed.
But do you have any evidence (tests/documents/links) proving it also 
does so for 64-bit processes on win64?



convention, but even that you can extend with x87.

It's just that the documentation tells you not to use the x87.

Daniël





Re: [fpc-devel] Extended type

2011-04-19 Thread Hans-Peter Diettrich

Nikolai Zhubr wrote:


Originally MS spread info it wouldn't work at all under Windows, but
that proved to be false,
the FPU works technically. Now MS just states it is unsupported.


And deprecated:
http://msdn.microsoft.com/en-us/library/ee418798(VS.85).aspx#Porting_to_64bit 





Thanks. I always knew that Windows is not an OS for serious work, but I
never heard that from Microsoft so clearly :-(


Not being an ms fan whatsoever, but you all seem to have missed the 
technical point here.


Because x87 (and also MMX in some sense) is a coprocessor (and has its 
own register space) its full state has to be saved/restored (by an OS) 
between different running processes in case any process might use 
fpu/mmx.


The same applies to the XMM/YMM registers. While dropping MMX support is 
acceptable, in favor of the new vector arithmetic instruction set, I see 
no point in dropping 80 bit reals before a new 128 bit arithmetic 
becomes available.


Clearly this may become rather inefficient performance-wise 
(because, well, an application might just want to use 2 fpu registers at 
a time, and OS will still have to store the whole bunch all the time...) 
Now, with the introduction of 64-bit processors IIRC AMD took care of 
this problem by providing some means to execute floating-point 
operations without the need for traditional FPU register space, thus 
allowing to avoid the need to save/restore FPU state. IIRC these are 
some _new_ opcodes, unavailable on earlier CPUs.


When Intel aliased the FPU and MMX registers, I don't understand why they 
*added* new XMM registers, instead of extending the already existing MMX 
registers - just for fast switching. But it is as it is...


So, for performance reasons, and because 64-bit applications (are now 
supposed to be) able to do all floating-point without touching the 
traditional FPU, ms (supposedly) decided to just not preserve FPU/MMX 
state between 64-bit processes. That's all. IMHO it makes some sense 
actually, though it would be much nicer if there was some option to 
select this deliberately (say at boot time or whatever).


At least an application should have a chance to specify which register 
sets have to be saved on a task switch. Unless stated otherwise by MS, 
the entire state should be saved, as long as x87/MMX is only deprecated, 
not dropped. Any official information on this issue?


DoDi



Re: [fpc-devel] Extended type

2011-04-19 Thread Daniël Mantione



On Tue, 19 Apr 2011, Nikolai Zhubr wrote:


19.04.2011 14:12, Daniël Mantione:


MS does preserve FPU states between processes. You can use the x87 on
Windows, nothing prevents you from doing so. Maybe the calling


Yes it does for 32-bit processes on win64, guaranteed.
But do you have any evidence (tests/documents/links) proving it also does so 
for 64-bit processes on win64?


Not at hand, but don't worry, it does preserve FPU states.

Daniël


Re: [fpc-devel] Extended type

2011-04-19 Thread Daniël Mantione



On Tue, 19 Apr 2011, Florian Klämpfl wrote:


It's just that the documentation tells you not to use the x87.


Yes, because its strange programming model really should be dropped.


Agree, but the 80 bit support makes some people want to use it. And that 
will stay this way until CPU manufacturers invent an alternative.


By the way, recent GCC versions calculate the goniometric functions in 
software using SSE3, and I checked that this is indeed slightly faster 
than the x87. So we can get rid of the x87 stuff, should we want.


Daniël


Re: [fpc-devel] Extended type

2011-04-19 Thread Florian Klaempfl

On 19.04.2011 12:27, Daniël Mantione wrote:



On Tue, 19 Apr 2011, Florian Klämpfl wrote:


It's just that the documentation tells you not to use the x87.


Yes, because its strange programming model really should be dropped.


Agree, but the 80 bit support makes some people want to use it. And that
will stay this way until CPU manufacturers invent an alternative.


Using extended typically only hides bad numerical algorithms. There
might be some corner cases where extended is useful, but in general I
think it's a matter of bad algorithms.
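
A minimal sketch of that point, using an example that is not taken from this thread: summing many copies of 0.1 naively in double precision accumulates a visible error, while compensated (Kahan) summation in the same 64-bit type removes almost all of it without any wider format. The function names are illustrative, and math.fsum serves as the correctly rounded reference.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation; rounding errors add up."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Compensated summation: carries the lost low-order bits along."""
    total = 0.0
    compensation = 0.0  # running estimate of the error in total
    for v in values:
        y = v - compensation
        t = total + y
        compensation = (t - total) - y  # what was rounded away in total + y
        total = t
    return total


values = [0.1] * 1_000_000
reference = math.fsum(values)  # correctly rounded sum of the inputs

naive_error = abs(naive_sum(values) - reference)
kahan_error = abs(kahan_sum(values) - reference)
print("naive error:", naive_error)
print("kahan error:", kahan_error)
```

Both loops use only 64-bit doubles; the improvement comes from the algorithm, not from extra bits in the accumulator.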




By the way, recent GCC versions calculate the goniometric functions in
software using SSE3, and I checked that this is indeed slightly faster
than the x87.


I know but as usual, time etc ;)


Re: [fpc-devel] Extended type

2011-04-19 Thread Marco van de Voort
In our previous episode, Daniël Mantione said:
 
 By the way, recent GCC versions calculate the goniometric functions in 
 software using SSE3, and I checked that this is indeed slightly faster 
 than the x87. So we can get rid of the x87 stuff, should we want.

You'll need a runtime test for SSE3 though, since the first generation of
Athlon 64s (ClawHammer and friends, Socket 754 or so) doesn't have SSE3.

I checked and 64-bit Pentium-D's do have SSE3, at least mine does:

CPU:
Intel(R) Pentium(R) D CPU 2.80GHz (2793.02-MHz K8-class CPU)
  Origin = GenuineIntel  Id = 0xf47  Family = f  Model = 4  Stepping = 7

  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x641d<SSE3,DTES64,MON,DS_CPL,CNXT-ID,CX16,xTPR>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
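
The Features2 word above is the ECX result of CPUID leaf 1, where SSE3 is bit 0. Here is a small Python sketch of decoding it. The table only lists the bits that are named in this dump, and the function names are hypothetical.

```python
# Bit positions in CPUID leaf 1, register ECX.
# Only the flags that appear in the dump above are listed here.
ECX_FEATURE_BITS = {
    0: "SSE3",
    2: "DTES64",
    3: "MON",
    4: "DS_CPL",
    10: "CNXT-ID",
    13: "CX16",
    14: "xTPR",
}


def decode_features2(ecx):
    """Return the names of the known feature bits set in ecx, lowest bit first."""
    return [name for bit, name in sorted(ECX_FEATURE_BITS.items())
            if ecx & (1 << bit)]


def has_sse3(ecx):
    """SSE3 support is reported in bit 0 of ECX."""
    return bool(ecx & 1)


features2 = 0x641d  # value from the Pentium D dump
print(decode_features2(features2))
print("SSE3 available:", has_sse3(features2))
```

A real runtime test would execute CPUID itself rather than parse boot messages; the bit test is the same either way.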


Re: [fpc-devel] Extended type

2011-04-19 Thread Florian Klaempfl

On 19.04.2011 15:18, Marco van de Voort wrote:

In our previous episode, Daniël Mantione said:


By the way, recent GCC versions calculate the goniometric functions in
software using SSE3, and I checked that this is indeed slightly faster
than the x87. So we can get rid of the x87 stuff, should we want.


You'll need a runtime test for SSE3 though, since the first generation of
Athlon 64s (ClawHammer and friends, Socket 754 or so) doesn't have SSE3.


For such relatively expensive operations, one runtime check per
function is IMO OK, all the more since the branch is predicted perfectly
after the first run.




I checked and 64-bit Pentium-D's do have SSE3, at least mine does:

CPU:
Intel(R) Pentium(R) D CPU 2.80GHz (2793.02-MHz K8-class CPU)
   Origin = GenuineIntel  Id = 0xf47  Family = f  Model = 4  Stepping = 7

   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
   Features2=0x641d<SSE3,DTES64,MON,DS_CPL,CNXT-ID,CX16,xTPR>
   AMD Features=0x20100800<SYSCALL,NX,LM>
   AMD Features2=0x1<LAHF>
   TSC: P-state invariant


Re: [fpc-devel] Extended type

2011-04-19 Thread Hans-Peter Diettrich

Florian Klaempfl wrote:

Using extended typically only hides bad numerical algorithms. There
might be some corner cases where extended is useful, but in general I
think it's a matter of bad algorithms.


Some algorithms converge faster with increased accuracy.

DoDi
