Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only
On Mon, Jul 2, 2012 at 11:25 AM, Catalin Patulea catal...@google.com wrote: FPREM1 now passes the TestFloat floatx80_rem suite (and FPREM is implemented very similarly). The code (the bulk of which is remainder_kernel and do_fprem) is derived from Bochs SVN revision 11224 dated 2012-06-21 10:33:37 -0700, with conversions to Qemu type aliases, C features only, etc. as needed. Signed-off-by: Catalin Patulea catal...@google.com --- fpu/softfloat.c | 195 +++ fpu/softfloat.h |4 + target-i386/op_helper.c | 166 3 files changed, 266 insertions(+), 99 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index b29256a..bd1879d 100644 [...] Ping - how do people feel about the latest patch?
Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only
On 2 July 2012 16:25, Catalin Patulea catal...@google.com wrote: FPREM1 now passes the TestFloat floatx80_rem suite (and FPREM is implemented very similarly). The code (the bulk of which is remainder_kernel and do_fprem) is derived from Bochs SVN revision 11224 dated 2012-06-21 10:33:37 -0700, with conversions to Qemu type aliases, C features only, etc. as needed. QEMU is the official capitalization. Signed-off-by: Catalin Patulea catal...@google.com --- fpu/softfloat.c | 195 +++ fpu/softfloat.h |4 + target-i386/op_helper.c | 166 3 files changed, 266 insertions(+), 99 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index b29256a..bd1879d 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -5234,6 +5234,16 @@ int floatx80_unordered_quiet( floatx80 a, floatx80 b STATUS_PARAM ) } /* +| Returns 1 if the extended double-precision floating-point value `a' is an +| unsupported value; otherwise returns 0. Let's try for something a little less cryptic to save future readers having to dig out the intel architecture manuals: an unsupported value (ie the bit pattern does not represent a valid IEEE number). +**/ +int floatx80_is_unsupported(floatx80 a) +{ +return extractFloatx80Exp(a) + !(extractFloatx80Frac(a) LIT64(0x8000)); +} This doesn't match up with all the cases in the Intel Software Developer's Manual table 8.3: it catches exponent non-zero but explicit integer bit is zero (pseudo-NaNs, pseudo-infinities, and unnormals) but not the case of exponent is zero but explicit integer bit is one (pseudo-denormals). [For those following along at home, it is because floatx80 has an explicit integer bit rather than the implicit bit used in IEEE single and double that you can get 'unsupported' bit patterns, where the explicit bit is a value different from what the implicit bit would be under IEEE rules.] + +/* | Returns the result of converting the quadruple-precision floating-point | value `a' to the 32-bit two's complement integer format. The conversion | is performed according to the IEC/IEEE Standard for Binary Floating-Point @@ -6828,6 +6838,191 @@ floatx80 floatx80_scalbn( floatx80 a, int n STATUS_PARAM ) aSign, aExp, aSig, 0 STATUS_VAR ); } +/* executes single exponent reduction cycle */ +static uint64_t remainder_kernel(uint64_t aSig0, uint64_t bSig, int expDiff, + uint64_t *zSig0, uint64_t *zSig1) +{ +uint64_t term0, term1; +uint64_t aSig1 = 0; + +shortShift128Left(aSig1, aSig0, expDiff, aSig1, aSig0); +uint64_t q = estimateDiv128To64(aSig1, aSig0, bSig); Declaring variables in the middle of code isn't QEMU coding style; top of the function, please. [Other cases below; I haven't bothered to call them all out.] +mul64To128(bSig, q, term0, term1); +sub128(aSig1, aSig0, term0, term1, zSig1, zSig0); +while ((int64)(*zSig1) 0) { Cast to int64 is almost certainly wrong: this conditional will give different results if int64 is exactly 64 bits vs if it is more than 64 bits. You probably wanted int64_t. +--q; +add128(*zSig1, *zSig0, 0, bSig, zSig1, zSig0); +} +return q; +} + +static int do_fprem(floatx80 a, floatx80 b, floatx80 *r, uint64_t *q, +int rounding_mode STATUS_PARAM) +{ +int32 aExp, bExp, zExp, expDiff; +uint64_t aSig0, aSig1, bSig; +flag aSign; +*q = 0; + +/* handle unsupported extended double-precision floating encodings */ +if (floatx80_is_unsupported(a) || floatx80_is_unsupported(b)) { +float_raise(float_flag_invalid, status); Use STATUS_VAR. It's stupid, but let's be consistently stupid until we get round to globally fixing it. +*r = floatx80_default_nan; +return -1; +} + +aSig0 = extractFloatx80Frac(a); +aExp = extractFloatx80Exp(a); +aSign = extractFloatx80Sign(a); +bSig = extractFloatx80Frac(b); +bExp = extractFloatx80Exp(b); + +if (aExp == 0x7FFF) { +if ((uint64_t) (aSig01) || ((bExp == 0x7FFF) + (uint64_t) (bSig1))) { aSig0 and bSig are already uint64_t, why the casts? +*r = propagateFloatx80NaN(a, b, status); +return -1; +} +float_raise(float_flag_invalid, status); +*r = floatx80_default_nan; +return -1; +} +if (bExp == 0x7FFF) { +if ((uint64_t) (bSig1)) { +*r = propagateFloatx80NaN(a, b, status); +return -1; +} +if (aExp == 0
Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only
On Fri, Jun 29, 2012 at 9:13 AM, Andreas Färber afaer...@suse.de wrote: Please run scripts/checkpatch.pl for CODING_STYLE issues, I spotted one slightly misplaced if brace, and one empty line at the bottom seemed to have indentation (git-am complains about that, too). Done. The only issue left is a false positive: ERROR: foo * bar should be foo *bar #248: FILE: fpu/softfloat.h:507: +int floatx80_remainder(floatx80, floatx80, floatx80 *, uint64_t * STATUS_PARAM); (STATUS_PARAM is not the name of the param, it's a macro.) The patch description will need to be cleaned up (not be letter-style). Done. I'm not too thrilled to introduce more uses of int32 (we started converting int16 to int_fast16_t) but I won't veto. It would be nice if you could review for any potential int32 vs. int32_t breakages though, to not make things worse than they are. I did notice one place where int32 was used where int should be used (the return variable of do_fprem), and that's now fixed. But for the other uses of int32, as I was saying to Peter, they are tied to the return type of extractFloatx80Exp. I wouldn't give the variable a different type than the function, and especially not in this case because the function should become int16_t, not int32_t. I'll be sending a PATCHv2 momentarily.
Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only
On Mon, Jul 2, 2012 at 11:25 AM, Catalin Patulea catal...@google.com wrote: FPREM1 now passes the TestFloat floatx80_rem suite (and FPREM is implemented very similarly). The code (the bulk of which is remainder_kernel and do_fprem) is derived from Bochs SVN revision 11224 dated 2012-06-21 10:33:37 -0700, with conversions to Qemu type aliases, C features only, etc. as needed. Sorry about that, forgot to fix the subject line to read PATCHv2. Will fix for future patches.
Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only
On 29 June 2012 02:50, Catalin Patulea catal...@google.com wrote: On Thu, Jun 28, 2012 at 2:25 PM, Peter Maydell peter.mayd...@linaro.org wrote: No new code should be using the uint64 c types (which are at least NN bits wide) -- uint64_t or uint_fast64_t please. Ok, changed some {int - flag} and {uint64 - uint64_t}. There are some int32s left, but these seem to be tied to extractFloatx80Exp's return type, which is int32 despite the underlying floatx80.high being an int16_t. Is this intentional? Fixing this would imply many other changes so I would punt on this for now. We haven't yet managed to do the full conversion away from the softfloat custom int32 types to the POSIX ones... + // handle unsupported extended double-precision floating encodings + if (floatx80_is_unsupported(a) || floatx80_is_unsupported(b)) + { + float_raise(float_flag_invalid, status); + *r = floatx80_default_nan; + return -1; + } So are we mishandling unsupported 80bit floats in other operations (eg addition), or do those functions just opencode things? Yes, I believe addition (and likely others) mishandles these. I put together a small C program that clears FP exceptions, adds two long doubles, and checks FP exceptions. (I do these things using glibc functions like feclearexcept - I don't think this is any different from doing it at the machine level, but just to be sure - are you aware of any issues with this approach?) Well, you're relying on the compiler and libc not to insert any other FPU operations into your code sequences, but it's probably OK. For ARM testing I use https://wiki.linaro.org/PeterMaydell/Risu , but getting that to properly support x86 is probably at least a few days work so if you're happy with your current testing approach that's fine. I'm not sure what you mean by opencode. do they just do the relevant checks as a bit of inline C (rather than calling out to a function to do the checking)? -- PMM
Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only
Am 28.06.2012 01:00, schrieb Catalin Patulea: Hey guys, I've been hacking up the FPREM and FPREM1 i386 instructions (without KVM) to be implemented using SoftFloat only. This was motivated by some correctness issues that we noticed with the current implementation which follows the AMD datasheet.I believe the datasheet explains the operation as if intermediate results had infinite precision, but in this case intermediate rounding causes a loss of precision at the final output. My reference was TestFloat [1], specifically the floatx80_rem suite, which tests the FPREM instruction quite thoroughly (FPREM1 is technically untested but very similar). The current code fails about half the suite; with the patch, all tests pass. There are still lots of dark corners - the code originates from Bochs' copy of SoftFloat and I tried to port the Bochs exception handling logic as much as I could. Many details still elude me though (see comments in the code). TestFloat does test some of the exception logic but not as thoroughly as I would have liked. If anyone can offer some guidance here, I am happy to fix up the patch. That's about it.. let me know what you think. Catalin [1] http://www.jhauser.us/arithmetic/TestFloat.html --- fpu/softfloat.c | 182 +++ fpu/softfloat.h |3 + target-i386/op_helper.c | 163 +- 3 files changed, 249 insertions(+), 99 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index b29256a..9c6e4a3 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -5234,6 +5234,16 @@ int floatx80_unordered_quiet( floatx80 a, floatx80 b STATUS_PARAM ) } /* +| Returns 1 if the extended double-precision floating-point value `a' is an +| unsupported; otherwise returns 0. +**/ +int floatx80_is_unsupported(floatx80 a) +{ +return (extractFloatx80Exp(a) +!(extractFloatx80Frac(a) LIT64(0x8000))); +} Disclaimer: It's been a while that I had to learn IEEE 754. Unsupported for some i386 instruction? Or unsupported in the SoftFloat library due to some limitation? Please run scripts/checkpatch.pl for CODING_STYLE issues, I spotted one slightly misplaced if brace, and one empty line at the bottom seemed to have indentation (git-am complains about that, too). The patch description will need to be cleaned up (not be letter-style). I'm not too thrilled to introduce more uses of int32 (we started converting int16 to int_fast16_t) but I won't veto. It would be nice if you could review for any potential int32 vs. int32_t breakages though, to not make things worse than they are. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only
On 29 June 2012 14:13, Andreas Färber afaer...@suse.de wrote: Am 28.06.2012 01:00, schrieb Catalin Patulea: /* +| Returns 1 if the extended double-precision floating-point value `a' is an +| unsupported; otherwise returns 0. +**/ +int floatx80_is_unsupported(floatx80 a) +{ + return (extractFloatx80Exp(a) + !(extractFloatx80Frac(a) LIT64(0x8000))); +} Disclaimer: It's been a while that I had to learn IEEE 754. Unsupported for some i386 instruction? Or unsupported in the SoftFloat library due to some limitation? It means 'unsupported' in the sense of Unsupported Double Extended-Precision Floating-Point Encodings, ie there are some bit patterns in a floatx80 which are not valid floating point numbers of any kind, and the Intel documentation calls these unsupported. See section 8.2.2 in Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 1 253665-034US. -- PMM
Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only
On 28 June 2012 00:00, Catalin Patulea catal...@google.com wrote: Hey guys, I've been hacking up the FPREM and FPREM1 i386 instructions (without KVM) to be implemented using SoftFloat only. This was motivated by some correctness issues that we noticed with the current implementation which follows the AMD datasheet.I believe the datasheet explains the operation as if intermediate results had infinite precision, but in this case intermediate rounding causes a loss of precision at the final output. My reference was TestFloat [1], specifically the floatx80_rem suite, which tests the FPREM instruction quite thoroughly (FPREM1 is technically untested but very similar). The current code fails about half the suite; with the patch, all tests pass. There are still lots of dark corners - the code originates from Bochs' copy of SoftFloat and I tried to port the Bochs exception handling logic as much as I could. Many details still elude me though (see comments in the code). TestFloat does test some of the exception logic but not as thoroughly as I would have liked. If anyone can offer some guidance here, I am happy to fix up the patch. That's about it.. let me know what you think. Catalin [1] http://www.jhauser.us/arithmetic/TestFloat.html --- fpu/softfloat.c | 182 +++ fpu/softfloat.h | 3 + target-i386/op_helper.c | 163 +- 3 files changed, 249 insertions(+), 99 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index b29256a..9c6e4a3 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -5234,6 +5234,16 @@ int floatx80_unordered_quiet( floatx80 a, floatx80 b STATUS_PARAM ) } /* +| Returns 1 if the extended double-precision floating-point value `a' is an +| unsupported; otherwise returns 0. an unsupported what? +**/ +int floatx80_is_unsupported(floatx80 a) +{ + return (extractFloatx80Exp(a) + !(extractFloatx80Frac(a) LIT64(0x8000))); +} + +/* | Returns the result of converting the quadruple-precision floating-point | value `a' to the 32-bit two's complement integer format. The conversion | is performed according to the IEC/IEEE Standard for Binary Floating-Point @@ -6828,6 +6838,178 @@ floatx80 floatx80_scalbn( floatx80 a, int n STATUS_PARAM ) aSign, aExp, aSig, 0 STATUS_VAR ); } +/* + * BEGIN from Bochs rev 11224 dated 2012-06-21 10:33:37 -0700 + * + * Converted to use Qemu type aliases, use C features only, etc. + */ I'm not convinced we want these markers in the source code (though they should probably be in the commit message). Can you confirm that the Bochs licence is compatible with the QEMU one? In particular, is it compatible with/the same as the license as stated at the top of softfloat.c? + +/* executes single exponent reduction cycle */ +static uint64 remainder_kernel(uint64 aSig0, uint64 bSig, int expDiff, uint64 *zSig0, uint64 *zSig1) +{ + uint64 term0, term1; + uint64 aSig1 = 0; No new code should be using the uint64 c types (which are at least NN bits wide) -- uint64_t or uint_fast64_t please. + + shortShift128Left(aSig1, aSig0, expDiff, aSig1, aSig0); + uint64 q = estimateDiv128To64(aSig1, aSig0, bSig); + mul64To128(bSig, q, term0, term1); + sub128(aSig1, aSig0, term0, term1, zSig1, zSig0); + while ((int64)(*zSig1) 0) { + --q; + add128(*zSig1, *zSig0, 0, bSig, zSig1, zSig0); + } + return q; +} + +static int do_fprem(floatx80 a, floatx80 b, floatx80 *r, uint64 *q, int rounding_mode STATUS_PARAM ) +{ + int32 aExp, bExp, zExp, expDiff; + uint64 aSig0, aSig1, bSig; + int aSign; + *q = 0; + + // handle unsupported extended double-precision floating encodings + if (floatx80_is_unsupported(a) || floatx80_is_unsupported(b)) + { + float_raise(float_flag_invalid, status); + *r = floatx80_default_nan; + return -1; + } So are we mishandling unsupported 80bit floats in other operations (eg addition), or do those functions just opencode things? + aSig0 = extractFloatx80Frac(a); + aExp = extractFloatx80Exp(a); + aSign = extractFloatx80Sign(a); + bSig = extractFloatx80Frac(b); + bExp = extractFloatx80Exp(b); + + if (aExp == 0x7FFF) { + if ((uint64) (aSig01) || ((bExp == 0x7FFF) (uint64) (bSig1))) { + *r = propagateFloatx80NaN(a, b, status); + return -1; + } + float_raise(float_flag_invalid, status); + *r = floatx80_default_nan; + return -1; + } + if (bExp == 0x7FFF) { + if ((uint64)
Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only
On Thu, Jun 28, 2012 at 2:25 PM, Peter Maydell peter.mayd...@linaro.org wrote: /* +| Returns 1 if the extended double-precision floating-point value `a' is an +| unsupported; otherwise returns 0. an unsupported what? I think it should be unsupported value - fixed. +/* + * BEGIN from Bochs rev 11224 dated 2012-06-21 10:33:37 -0700 + * + * Converted to use Qemu type aliases, use C features only, etc. + */ I'm not convinced we want these markers in the source code (though they should probably be in the commit message). Ok, moved to the commit message. Can you confirm that the Bochs licence is compatible with the QEMU one? In particular, is it compatible with/the same as the license as stated at the top of softfloat.c? Bochs is licensed under LGPL 2.1, and QEMU is GPL 2. According to the GNU license compatibility matrix [1], it appears to be allowed to copy code from Bochs to QEMU. However, each copy of SoftFloat is under its own license, which has a warranty disclaimer and the following: Derivative works are acceptable, even for commercial purposes, so long as (1) the source code for the derivative work includes prominent notice that the work is derivative, and (2) the source code includes prominent notice with these four paragraphs for those parts of this code that are retained. I'm not sure which license takes precedence. Perhaps because each project has modified SoftFloat, each copy is now licensed under the broader project license. In either case, I don't think there are any issues (though IANAL). +static uint64 remainder_kernel(uint64 aSig0, uint64 bSig, int expDiff, uint64 *zSig0, uint64 *zSig1) +{ + uint64 term0, term1; + uint64 aSig1 = 0; No new code should be using the uint64 c types (which are at least NN bits wide) -- uint64_t or uint_fast64_t please. Ok, changed some {int - flag} and {uint64 - uint64_t}. There are some int32s left, but these seem to be tied to extractFloatx80Exp's return type, which is int32 despite the underlying floatx80.high being an int16_t. Is this intentional? Fixing this would imply many other changes so I would punt on this for now. + // handle unsupported extended double-precision floating encodings + if (floatx80_is_unsupported(a) || floatx80_is_unsupported(b)) + { + float_raise(float_flag_invalid, status); + *r = floatx80_default_nan; + return -1; + } So are we mishandling unsupported 80bit floats in other operations (eg addition), or do those functions just opencode things? Yes, I believe addition (and likely others) mishandles these. I put together a small C program that clears FP exceptions, adds two long doubles, and checks FP exceptions. (I do these things using glibc functions like feclearexcept - I don't think this is any different from doing it at the machine level, but just to be sure - are you aware of any issues with this approach?) The operands were (decimal notation, then hexadecimal exponent.significand): st0: 1. 3FFF.8000 + st1: 0. 3FFF. // Intel IA-32 [2] defines the latter as a positive unnormal (Table 8-3), one of those unsupported values. The result on bare metal was: result: -nan .C000 (with FE_INVALID) qemu i386-linux-user kvm-less: result: 3. 4000.C000 (no exceptions) I'm not sure what you mean by opencode. +/* +| Returns the remainder of the extended double-precision floating-point value +| `a' with respect to the corresponding value `b'. The operation is performed +| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic. +**/ This claims to return the remainder, but do_fprem only returns 0, 1 or -1... Ok, fixed the comments. + /* TODO(catalinp): + * + * - What is the defined purpose of fp_status in Qemu? Seems many things + * write into it, but few if any translate into env-fpus. fp_status is a float_status structure which has two purposes: * tells the softfloat code how to behave in certain circumstances (how to handle denormals, rounding modes, etc) * tracks the IEEE cumulative exception flags (inexact, overflow, etc) Every softfloat routine which needs to set exception flags or behave differently according to the flags in fp_status will take a float_status* argument. (This is obscured slightly by the silly STATUS_PARAM #define.) (Note that for some CPU architectures there may be more than one float_status struct; eg ARM has both an fp_status and a standard_fp_status, because Neon operations behave differently from VFP ones; see comments in target-arm/cpu.h. The Neon operations pass the softfloat code the standard_fp_status, and the VFP ops
Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only
Am 28.06.2012 01:00, schrieb Catalin Patulea: Hey guys, I've been hacking up the FPREM and FPREM1 i386 instructions (without KVM) to be implemented using SoftFloat only. This was motivated by some correctness issues that we noticed with the current implementation which follows the AMD datasheet.I believe the datasheet explains the operation as if intermediate results had infinite precision, but in this case intermediate rounding causes a loss of precision at the final output. My reference was TestFloat [1], specifically the floatx80_rem suite, which tests the FPREM instruction quite thoroughly (FPREM1 is technically untested but very similar). The current code fails about half the suite; with the patch, all tests pass. There are still lots of dark corners - the code originates from Bochs' copy of SoftFloat and I tried to port the Bochs exception handling logic as much as I could. Many details still elude me though (see comments in the code). TestFloat does test some of the exception logic but not as thoroughly as I would have liked. If anyone can offer some guidance here, I am happy to fix up the patch. That's about it.. let me know what you think. Catalin [1] http://www.jhauser.us/arithmetic/TestFloat.html Hi Catalin, could you please check your patch using the latest scripts/checkpatch.pl? There are several style issues which should be fixed. I suggest to fix them even in code which was copied from Bochs. Thanks, Stefan W.