On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright wrote:
On 8/6/2016 1:21 AM, Ilya Yaroshenko wrote:
We need 2 new pragmas with the same syntax as `pragma(inline,
xxx)`:
1. `pragma(fusedMath)` allows fused mul-add, mul-sub, div-add,
div-sub operations.
2. `pragma(fastMath)`
On 8/6/2016 11:45 PM, Ilya Yaroshenko wrote:
So it does make sense that allowing fused operations would be equivalent to
having no maximum precision.
Fused operations are mul/div+add/sub only.
Fused operations do not break compensator subtraction:
auto t = a - x + x;
So, please, make them as
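The compensator-subtraction pattern Ilya refers to is the core of Kahan's compensated summation. A minimal C sketch (the function name is mine, not from the thread):

```c
#include <stddef.h>

/* Kahan compensated summation: the "a - x + x"-style compensator
 * recovers the low-order bits lost by each addition. */
double kahan_sum(const double *x, size_t n) {
    double s = 0.0;  /* running sum */
    double c = 0.0;  /* running compensation (lost low-order bits) */
    for (size_t i = 0; i < n; ++i) {
        double y = x[i] - c;  /* apply previously captured error */
        double t = s + y;     /* big + small: low bits of y are lost */
        c = (t - s) - y;      /* compensator subtraction recovers them */
        s = t;
    }
    return s;
}
```

For input {1.0, 1e-16, 1e-16} a naive loop returns exactly 1.0, while the compensated sum returns the correctly rounded 1.0000000000000002. A compiler that algebraically simplifies (t - s) - y to zero destroys the algorithm, which is why fusion must not license general reassociation.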
On 6 August 2016 at 22:12, David Nadlinger via Digitalmars-d
wrote:
> On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
>>
>> No pragmas tied to a specific architecture should be allowed in the
>> language spec, please.
>
>
> I wholeheartedly agree.
On Saturday, 6 August 2016 at 22:32:08 UTC, Walter Bright wrote:
On 8/6/2016 3:14 PM, David Nadlinger wrote:
Of course, if floating point values are strictly defined as
having only a
minimum precision, then folding away the rounding after the
multiplication is
always legal.
Yup.
So it does
On Saturday, 6 August 2016 at 21:56:06 UTC, Walter Bright wrote:
On 8/6/2016 1:06 PM, Ilya Yaroshenko wrote:
Some applications require exactly the same results for
different architectures
(probably because of a business requirement). So this optimization
is turned off by
default in LDC for
On 8/6/2016 3:14 PM, David Nadlinger wrote:
On Saturday, 6 August 2016 at 21:56:06 UTC, Walter Bright wrote:
Let me rephrase the question - how does fusing them alter the result?
There is just one rounding operation instead of two.
Makes sense.
Of course, if floating point values are
On Saturday, 6 August 2016 at 21:56:06 UTC, Walter Bright wrote:
Let me rephrase the question - how does fusing them alter the
result?
There is just one rounding operation instead of two.
Of course, if floating point values are strictly defined as
having only a minimum precision, then
On 8/6/2016 2:12 PM, David Nadlinger wrote:
This is true – and precisely the reason why it is actually defined
(ldc.attributes) as
---
alias fastmath = AliasSeq!(llvmAttr("unsafe-fp-math", "true"),
llvmFastMathFlag("fast"));
---
This way, users can actually combine different optimisations in a
On 8/6/2016 1:06 PM, Ilya Yaroshenko wrote:
Some applications require exactly the same results for different architectures
(probably because of a business requirement). So this optimization is turned off by
default in LDC, for example.
Let me rephrase the question - how does fusing them alter the
On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright wrote:
The LDC fastmath bothers me a lot. It throws away proper NaN
and infinity handling, and throws away precision by allowing
reciprocal and algebraic transformations.
This is true – and precisely the reason why it is actually
On Saturday, 6 August 2016 at 12:48:26 UTC, Iain Buclaw wrote:
There are compiler switches for that. Maybe there should be
one pragma to tweak these compiler switches on a per-function
basis, rather than separately named pragmas.
This might be a solution for inherently compiler-specific
On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
No pragmas tied to a specific architecture should be allowed in
the language spec, please.
I wholeheartedly agree. However, it's not like FP optimisation
pragmas would be specific to any particular architecture. They
just
On Saturday, 6 August 2016 at 19:51:11 UTC, Walter Bright wrote:
On 8/6/2016 2:48 AM, Ilya Yaroshenko wrote:
I don't know what the point of fusedMath is.
It allows a compiler to replace two arithmetic operations with
a single composed
one; see the AVX2 (FMA3 for Intel and FMA4 for AMD) instruction
On 8/6/2016 2:48 AM, Ilya Yaroshenko wrote:
I don't know what the point of fusedMath is.
It allows a compiler to replace two arithmetic operations with a single composed
one; see the AVX2 (FMA3 for Intel and FMA4 for AMD) instruction set.
I understand that, I just don't understand why that wouldn't
On 8/6/2016 3:02 AM, Iain Buclaw via Digitalmars-d wrote:
No pragmas tied to a specific architecture should be allowed in the
language spec, please.
A good point. On the other hand, a list of them would be nice so implementations
don't step on each other.
On 8/6/2016 5:09 AM, Johannes Pfau wrote:
I think this restriction is also quite arbitrary.
You're right that there are gray areas, but the distinction is not arbitrary.
For example, mangling does not affect the interface. It affects the name.
Using an attribute has more downsides, as it
On 6 August 2016 at 16:11, Patrick Schluter via Digitalmars-d
wrote:
> On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
>>
>> On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d
>> wrote:
>>>
>>> On Saturday, 6
On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d
wrote:
On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright
wrote:
No pragmas tied to a specific architecture should be allowed in
On 6 August 2016 at 13:30, Ilya Yaroshenko via Digitalmars-d
wrote:
> On Saturday, 6 August 2016 at 11:10:18 UTC, Iain Buclaw wrote:
>>
>> On 6 August 2016 at 12:07, Ilya Yaroshenko via Digitalmars-d
>> wrote:
>>>
>>> On Saturday, 6
Am Sat, 6 Aug 2016 02:29:50 -0700
schrieb Walter Bright :
> On 8/6/2016 1:21 AM, Ilya Yaroshenko wrote:
> > On Friday, 5 August 2016 at 20:53:42 UTC, Walter Bright wrote:
> >
> >> I agree that the typical summation algorithm suffers from double
> >> rounding. But
On Saturday, 6 August 2016 at 11:10:18 UTC, Iain Buclaw wrote:
On 6 August 2016 at 12:07, Ilya Yaroshenko via Digitalmars-d
wrote:
On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d
On 6 August 2016 at 12:07, Ilya Yaroshenko via Digitalmars-d
wrote:
> On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
>>
>> On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d
>> wrote:
>>>
>>> On Saturday, 6
On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d
wrote:
On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright
wrote:
[...]
OK, then we need a third pragma, `pragma(ieeeRound)`. But
On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d
wrote:
> On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright wrote:
>>
>> On 8/6/2016 1:21 AM, Ilya Yaroshenko wrote:
>>>
>>> We need 2 new pragmas with the same syntax as `pragma(inline, xxx)`:
>>>
On 8/6/2016 1:21 AM, Ilya Yaroshenko wrote:
We need 2 new pragmas with the same syntax as `pragma(inline, xxx)`:
1. `pragma(fusedMath)` allows fused mul-add, mul-sub, div-add, div-sub
operations.
2. `pragma(fastMath)` is equivalent to [1]. This pragma can be used to allow
extended precision.
On 8/6/2016 1:21 AM, Ilya Yaroshenko wrote:
On Friday, 5 August 2016 at 20:53:42 UTC, Walter Bright wrote:
I agree that the typical summation algorithm suffers from double rounding. But
that's one algorithm. I would appreciate if you would review
On 4 August 2016 at 23:38, Seb via Digitalmars-d
wrote:
> On Thursday, 4 August 2016 at 21:13:23 UTC, Iain Buclaw wrote:
>>
>> On 4 August 2016 at 01:00, Seb via Digitalmars-d
>> wrote:
>>>
>>> To make matters worse std.math yields
On Friday, 5 August 2016 at 20:53:42 UTC, Walter Bright wrote:
I agree that the typical summation algorithm suffers from
double rounding. But that's one algorithm. I would appreciate
if you would review
http://dlang.org/phobos/std_algorithm_iteration.html#sum to
ensure it doesn't have this
On 8/5/2016 2:40 AM, Ilya Yaroshenko wrote:
No. For example std.math.log requires it! But you don't care about other
compilers which do not use yl2x, or about making it a template (the real version
slows down code for double and float).
I'm interested in correct to the last bit results first, and
On 8/5/2016 4:27 AM, Seb wrote:
1) There are some functions (exp, pow, log, round, sqrt) for which using
llvm_intrinsics significantly increases performance.
It's a simple benchmark and might be flawed, but I hope it shows the point.
Speed is not the only criterion. Accuracy is as well.
Thanks for finding these.
On 8/5/2016 3:22 AM, Ilya Yaroshenko wrote:
1. https://www.python.org/ftp/python/3.5.2/Python-3.5.2.tgz
mathmodule.c, math_fsum has comment:
Depends on IEEE 754 arithmetic guarantees and half-even rounding.
The same algorithm is also available in Mir. And it does not
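The IEEE-754 round-to-nearest guarantee that math_fsum depends on is captured by Knuth's TwoSum building block, sketched here in C (names mine):

```c
/* TwoSum (Knuth): under round-to-nearest IEEE arithmetic, produces s and
 * err such that s + err == a + b exactly, with s == fl(a + b).
 * Shewchuk-style summation (as used by Python's math.fsum) builds on this;
 * it is only valid if the compiler performs each operation as written. */
void two_sum(double a, double b, double *s, double *err) {
    *s = a + b;
    double bb = *s - a;                /* the part of b that made it into s */
    *err = (a - (*s - bb)) + (b - bb); /* what rounding threw away */
}
```

Any "fast math" mode that reassociates or simplifies these subtractions silently returns err == 0 and breaks the algorithm.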
On Friday, 5 August 2016 at 09:21:53 UTC, Ilya Yaroshenko wrote:
On Friday, 5 August 2016 at 08:43:48 UTC, deadalnix wrote:
On Friday, 5 August 2016 at 08:17:00 UTC, Ilya Yaroshenko
wrote:
1. Could you please provide an assembler example with clang
or recent gcc?
I have better: compile your
On Friday, 5 August 2016 at 09:40:59 UTC, Ilya Yaroshenko wrote:
On Friday, 5 August 2016 at 09:24:49 UTC, Walter Bright wrote:
On 8/5/2016 12:43 AM, Ilya Yaroshenko wrote:
You are wrong that there are far fewer of those cases. This
is a naive point of
view. A lot of netlib math functions
Here is a relevant example:
https://hal.inria.fr/inria-00171497v1/document
It is used in at least one real world geometric modeling system.
On Friday, 5 August 2016 at 09:40:23 UTC, Walter Bright wrote:
On 8/5/2016 12:43 AM, Ilya Yaroshenko wrote:
You are wrong that there are far fewer of those cases. This is
a naive point of
view. A lot of netlib math functions require exact IEEE
arithmetic. Tinflex
requires it. Python C backend
On Friday, 5 August 2016 at 09:24:49 UTC, Walter Bright wrote:
On 8/5/2016 12:43 AM, Ilya Yaroshenko wrote:
You are wrong that there are far fewer of those cases. This is
a naive point of
view. A lot of netlib math functions require exact IEEE
arithmetic. Tinflex
requires it. Python C backend
On 8/5/2016 12:43 AM, Ilya Yaroshenko wrote:
You are wrong that there are far fewer of those cases. This is a naive point of
view. A lot of netlib math functions require exact IEEE arithmetic. Tinflex
requires it. Python C backend and Mir library require exact IEEE arithmetic.
Atmosphere package
On Friday, 5 August 2016 at 08:43:48 UTC, deadalnix wrote:
On Friday, 5 August 2016 at 08:17:00 UTC, Ilya Yaroshenko wrote:
1. Could you please provide an assembler example with clang or
recent gcc?
I have better: compile your favorite project with
-Wdouble-promotion and enjoy the rain of
On 8/5/2016 1:17 AM, Ilya Yaroshenko wrote:
2. C compilers do not promote double to 80-bit reals anyway.
Java originally came out with an edict that floats will all be done in float
precision, and double in double.
Sun had evidently never used an x87 before, because it soon became obvious that
On Friday, 5 August 2016 at 08:17:00 UTC, Ilya Yaroshenko wrote:
1. Could you please provide an assembler example with clang or
recent gcc?
I have better: compile your favorite project with
-Wdouble-promotion and enjoy the rain of warnings.
But try it yourself:
float foo(float a, float b)
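For readers unfamiliar with the warning: -Wdouble-promotion fires when a float operand is silently widened to double by the usual arithmetic conversions. A minimal illustration (my own, separate from the elided example above):

```c
/* 2.0 is a double literal, so `a` is promoted, the multiply runs in
 * double, and the result is converted back to float on return --
 * exactly the implicit widening that -Wdouble-promotion reports. */
float scale_promoted(float a) { return a * 2.0; }

/* 2.0f keeps the whole computation in float: no promotion, no warning. */
float scale_float(float a) { return a * 2.0f; }
```

Both return the same value here; the difference is the intermediate precision (and, on some targets, the cost of the float<->double conversions).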
On Friday, 5 August 2016 at 07:59:15 UTC, deadalnix wrote:
On Friday, 5 August 2016 at 07:43:19 UTC, Ilya Yaroshenko wrote:
You are wrong that there are far fewer of those cases. This is
a naive point of view. A lot of netlib math functions require
exact IEEE arithmetic. Tinflex requires it.
On Friday, 5 August 2016 at 07:43:19 UTC, Ilya Yaroshenko wrote:
On Friday, 5 August 2016 at 06:59:21 UTC, Walter Bright wrote:
On 8/4/2016 11:05 PM, Fool wrote:
I understand your point of view. However, there are (probably
rare) situations
where one requires more control. I think that
On Friday, 5 August 2016 at 06:59:21 UTC, Walter Bright wrote:
On 8/4/2016 11:05 PM, Fool wrote:
I understand your point of view. However, there are (probably
rare) situations
where one requires more control. I think that simulating
double-double precision
arithmetic using Veltkamp split was
On 8/4/2016 11:05 PM, Fool wrote:
I understand your point of view. However, there are (probably rare) situations
where one requires more control. I think that simulating double-double precision
arithmetic using Veltkamp split was mentioned as a reasonable example, earlier.
There are cases where
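Veltkamp splitting, for reference, is the exact-splitting step behind double-double arithmetic; a minimal C sketch (function name mine):

```c
/* Veltkamp split: writes a == hi + lo exactly, where hi and lo each fit
 * in 26 significant bits, so products such as hi*hi are then exact in
 * double. The constant is 2^27 + 1 for IEEE binary64. The algorithm is
 * only correct when every operation is rounded individually -- extended
 * intermediate precision or value-changing fast-math breaks it. */
void veltkamp_split(double a, double *hi, double *lo) {
    double c = 134217729.0 * a;  /* (2^27 + 1) * a */
    *hi = c - (c - a);
    *lo = a - *hi;
}
```

This is why the thread keeps stressing strict per-operation IEEE semantics: the split looks like algebra that folds to hi = a, lo = 0, and an "optimizing" compiler that treats it that way destroys the technique.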
On Thursday, 4 August 2016 at 20:58:57 UTC, Walter Bright wrote:
On 8/4/2016 1:29 PM, Fool wrote:
I'm afraid, I don't understand your implementation. Isn't
toFloat(x) +
toFloat(y) computed in real precision (first rounding)? Why
doesn't
toFloat(toFloat(x) + toFloat(y)) involve another
On Thursday, 4 August 2016 at 21:13:23 UTC, Iain Buclaw wrote:
On 4 August 2016 at 01:00, Seb via Digitalmars-d
wrote:
To make matters worse std.math yields different results than
compiler/assembly intrinsics - note that in this example
import std.math.pow adds
On 8/4/2016 2:13 PM, Iain Buclaw via Digitalmars-d wrote:
This could be something specific to your architecture. I get the same
result from all versions of powf, and from GCC builtins too,
regardless of optimization tunings.
It's important to remember that what gcc does and what the C
On 4 August 2016 at 01:00, Seb via Digitalmars-d
wrote:
>
> Consider the following program, it fails on 32-bit :/
>
It would be nice if explicit casts were honoured by CTFE here.
toDouble(a + b) just seems to be sidestepping why CTFE ignores the
cast in
On 8/4/2016 1:29 PM, Fool wrote:
I'm afraid, I don't understand your implementation. Isn't toFloat(x) +
toFloat(y) computed in real precision (first rounding)? Why doesn't
toFloat(toFloat(x) + toFloat(y)) involve another rounding?
You're right, in that case, it does. But C does, too:
IEEE behaviour by default is required by numeric software.
@fastmath (like recent LDC) or something like that can be used to
allow extended precision.
Ilya
On 8/4/2016 1:03 PM, deadalnix wrote:
It is actually very common for C compiler to work with double for intermediate
values, which isn't far from what D does.
In fact, it used to be specified that C behave that way!
On Thursday, 4 August 2016 at 20:00:14 UTC, Walter Bright wrote:
On 8/4/2016 12:03 PM, Fool wrote:
How can we ensure that toFloat(toFloat(x) + toFloat(y)) does
not involve
double-rounding?
It's the whole point of it.
I'm afraid, I don't understand your implementation. Isn't
toFloat(x) +
On 8/4/2016 12:03 PM, Fool wrote:
How can we ensure that toFloat(toFloat(x) + toFloat(y)) does not involve
double-rounding?
It's the whole point of it.
On Thursday, 4 August 2016 at 18:53:23 UTC, Walter Bright wrote:
On 8/4/2016 7:08 AM, Andrew Godfrey wrote:
Now, my major experience is in the context of Intel non-SIMD
FP, where internal
precision is 80-bit. I can see the appeal of asking for the
ability to reduce
internal precision to match
On 8/4/2016 11:53 AM, Walter Bright wrote:
It has been proposed many times that the solution for D is to have a function
called toFloat() or something like that in core.math, which guarantees a round
to float precision for its argument. But so far nobody has written such a
function.
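The proposed core.math helper has a straightforward C analogue, sketched below. The name toFloat is the one floated in the thread; the implementation is my assumption of the intent, not an existing library function:

```c
/* Force rounding to float precision at this point, even if the
 * surrounding expression is being evaluated in wider precision.
 * The volatile store keeps the compiler from carrying the value
 * in a wider (e.g. 80-bit x87) register past the conversion. */
static inline float toFloat(double x) {
    volatile float f = (float)x;
    return f;
}
```

With such a helper, toFloat(toFloat(x) + toFloat(y)) pins each operand and the result to float precision, which is the guarantee the proposal asks for.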
On Thursday, 4 August 2016 at 18:53:23 UTC, Walter Bright wrote:
It has been proposed many times that the solution for D is to
have a function called toFloat() or something like that in
core.math, which guarantees a round to float precision for its
argument. But so far nobody has written such
On 8/4/2016 7:08 AM, Andrew Godfrey wrote:
Now, my major experience is in the context of Intel non-SIMD FP, where internal
precision is 80-bit. I can see the appeal of asking for the ability to reduce
internal precision to match the data type you're using, and I think I've read
something written
On Wednesday, 3 August 2016 at 23:00:11 UTC, Seb wrote:
There was a recent discussion on Phobos about D's floating
point behavior [1]. I think Ilya summarized quite elegantly our
problem:
[...]
In my experience (production-quality FP coding in C++), you are
in error merely by combining