Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Iain Buclaw via Digitalmars-d
On 6 August 2016 at 22:12, David Nadlinger via Digitalmars-d
 wrote:
> On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
>>
>> No pragmas tied to a specific architecture should be allowed in the
>> language spec, please.
>
>
> I wholeheartedly agree. However, it's not like FP optimisation pragmas would
> be specific to any particular architecture. They just describe classes of
> transformations that are allowed on top of the standard semantics.
>
> For example, whether transforming `a + (b * c)` into a single operation is
> allowed is not a question of the target architecture at all, but rather
> whether the implicit rounding after evaluating (b * c) can be skipped or
> not. While this in turn of course enables the compiler to use FMA
> instructions on x86/AVX, ARM/NEON, PPC, …, it is not architecture-specific
> at all on a conceptual level.
>

Well, you get fusedMath for free when turning on -mfma or -mfused-madd
- whatever is most relevant for the target.

Try adding -mfma here.  http://goo.gl/xsvDXM
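A minimal case to try with that flag, assuming a GDC-style command line
(the file name is illustrative):

---
// muladd.d: compile with `gdc -O2 -mfma muladd.d`.
// With FMA enabled the compiler is free to contract this into a
// single fused multiply-add instruction.
double muladd(double a, double b, double c)
{
    return a * b + c;
}
---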



Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Ilya Yaroshenko via Digitalmars-d

On Saturday, 6 August 2016 at 22:32:08 UTC, Walter Bright wrote:

On 8/6/2016 3:14 PM, David Nadlinger wrote:
Of course, if floating point values are strictly defined as 
having only a minimum precision, then folding away the rounding 
after the multiplication is always legal.


Yup.

So it does make sense that allowing fused operations would be 
equivalent to having no maximum precision.


Fused operations are mul/div+add/sub only.
Fused operations do not break compensated subtraction:
auto t = a - x + x;
So, please, make them a separate pragma.
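To illustrate the distinction in D (a sketch; the pragma names are the
proposed ones from this thread, not implemented anywhere):

---
double compensated(double a, double x)
{
    double t = a - x + x;  // relies on each +/- rounding exactly once
    // fusedMath: only contracts mul/div followed by add/sub into one
    // instruction, so a pure add/sub chain like this is untouched.
    // fastMath: licenses algebraic rewrites, so (a - x) + x may be
    // folded to plain a, deleting the compensation.
    return t;
}
---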


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Ilya Yaroshenko via Digitalmars-d

On Saturday, 6 August 2016 at 21:56:06 UTC, Walter Bright wrote:

On 8/6/2016 1:06 PM, Ilya Yaroshenko wrote:
Some applications require exactly the same results across 
different architectures (probably because of business 
requirements). So this optimization is turned off by default in 
LDC, for example.


Let me rephrase the question - how does fusing them alter the 
result?


The result becomes more precise, because there is a single 
rounding instead of two.


Re: The Computer Language Benchmarks Game

2016-08-06 Thread Walter Bright via Digitalmars-d

On 8/5/2016 7:02 AM, qznc wrote:

Ultimately, my opinion is that the benchmark is outdated and not useful today. I
ignore it whenever anybody cites the benchmark game for performance measurements.


Yeah, I wouldn't bother with it, either.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Walter Bright via Digitalmars-d

On 8/6/2016 3:14 PM, David Nadlinger wrote:

On Saturday, 6 August 2016 at 21:56:06 UTC, Walter Bright wrote:

Let me rephrase the question - how does fusing them alter the result?


There is just one rounding operation instead of two.


Makes sense.



Of course, if floating point values are strictly defined as having only a
minimum precision, then folding away the rounding after the multiplication is
always legal.


Yup.

So it does make sense that allowing fused operations would be equivalent to 
having no maximum precision.




Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread David Nadlinger via Digitalmars-d

On Saturday, 6 August 2016 at 21:56:06 UTC, Walter Bright wrote:
Let me rephrase the question - how does fusing them alter the 
result?


There is just one rounding operation instead of two.

Of course, if floating point values are strictly defined as 
having only a minimum precision, then folding away the rounding 
after the multiplication is always legal.


 — David


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Walter Bright via Digitalmars-d

On 8/6/2016 2:12 PM, David Nadlinger wrote:

This is true – and precisely the reason why it is actually defined
(ldc.attributes) as

---
alias fastmath = AliasSeq!(llvmAttr("unsafe-fp-math", "true"),
                           llvmFastMathFlag("fast"));
---

This way, users can actually combine different optimisations in a more tasteful
manner as appropriate for their particular application.

Experience has shown that people – even those intimately familiar with FP
semantics – expect a catch-all kitchen-sink switch for all natural optimisations
(natural when equating FP values with real numbers). This is why the shorthand
exists.


I didn't know that, thanks for the explanation. But the same can be done for 
pragmas, as the second argument isn't just true|false, it's an expression.
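For instance, something along these lines would be expressible (purely 
hypothetical: neither the pragma name nor the flag enum exists in any 
compiler):

---
enum FpFlag { fused = 1, noSignedZeros = 2, reciprocal = 4 }

// The second pragma argument is an ordinary expression, so flag
// combinations could be passed directly:
pragma(fpMath, FpFlag.fused | FpFlag.noSignedZeros);
---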


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Walter Bright via Digitalmars-d

On 8/6/2016 1:06 PM, Ilya Yaroshenko wrote:

Some applications require exactly the same results across different architectures
(probably because of business requirements). So this optimization is turned off by
default in LDC, for example.


Let me rephrase the question - how does fusing them alter the result?


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread David Nadlinger via Digitalmars-d

On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright wrote:
The LDC fastmath bothers me a lot. It throws away proper NaN 
and infinity handling, and throws away precision by allowing 
reciprocal and algebraic transformations.


This is true – and precisely the reason why it is actually 
defined (ldc.attributes) as


---
alias fastmath = AliasSeq!(llvmAttr("unsafe-fp-math", "true"),
                           llvmFastMathFlag("fast"));

---

This way, users can actually combine different optimisations in a 
more tasteful manner as appropriate for their particular 
application.


Experience has shown that people – even those intimately familiar 
with FP semantics – expect a catch-all kitchen-sink switch for 
all natural optimisations (natural when equating FP values with 
real numbers). This is why the shorthand exists.
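For illustration, applying the shorthand looks like this (LDC-only, 
since ldc.attributes is compiler-specific):

---
import ldc.attributes : fastmath;

@fastmath
double dot(const(double)[] a, const(double)[] b)
{
    double s = 0.0;
    foreach (i; 0 .. a.length)
        s += a[i] * b[i];  // fast-math flags permit fusing and
                           // reassociating this reduction
    return s;
}
---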


 — David


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread David Nadlinger via Digitalmars-d

On Saturday, 6 August 2016 at 12:48:26 UTC, Iain Buclaw wrote:
There are compiler switches for that.  Maybe there should be 
one pragma to tweak these compiler switches on a per-function 
basis, rather than separately named pragmas.


This might be a solution for inherently compiler-specific 
settings (although for LDC we would probably go for "type-safe" 
UDAs/pragmas instead of parsing faux command-line strings).


Floating point transformation semantics aren't compiler-specific, 
though. The corresponding options are used commonly enough in 
certain kinds of code that it doesn't seem prudent to require 
users to resort to compiler-specific ways of expressing them.


 — David


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread David Nadlinger via Digitalmars-d

On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
No pragmas tied to a specific architecture should be allowed in 
the language spec, please.


I wholeheartedly agree. However, it's not like FP optimisation 
pragmas would be specific to any particular architecture. They 
just describe classes of transformations that are allowed on top 
of the standard semantics.


For example, whether transforming `a + (b * c)` into a single 
operation is allowed is not a question of the target architecture 
at all, but rather whether the implicit rounding after evaluating 
(b * c) can be skipped or not. While this in turn of course 
enables the compiler to use FMA instructions on x86/AVX, 
ARM/NEON, PPC, …, it is not architecture-specific at all on a 
conceptual level.
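A small D illustration of that rounding question (assuming std.math.fma 
performs a single rounding, or evaluates in higher precision as on x86):

---
import std.math : fma;
import std.stdio : writefln;

void main()
{
    double a = -1.0;
    double b = 1.0 + 0x1p-27;
    double c = 1.0 - 0x1p-27;
    // b * c is exactly 1 - 2^-54, which rounds to 1.0 in double.
    double separate = a + b * c;  // two roundings (with strict double
                                  // temporaries): yields 0.0
    double fused = fma(b, c, a);  // one rounding: yields -2^-54
    writefln("%a vs %a", separate, fused);
}
---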


 — David


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Ilya Yaroshenko via Digitalmars-d

On Saturday, 6 August 2016 at 19:51:11 UTC, Walter Bright wrote:

On 8/6/2016 2:48 AM, Ilya Yaroshenko wrote:

I don't know what the point of fusedMath is.
It allows a compiler to replace two arithmetic operations with a 
single composed one; see the AVX2 (FMA3 for Intel and FMA4 for 
AMD) instruction sets.


I understand that, I just don't understand why that wouldn't be 
done anyway.


Some applications require exactly the same results across 
different architectures (probably because of business 
requirements). So this optimization is turned off by default in 
LDC, for example.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Walter Bright via Digitalmars-d

On 8/6/2016 2:48 AM, Ilya Yaroshenko wrote:

I don't know what the point of fusedMath is.

It allows a compiler to replace two arithmetic operations with a single composed
one; see the AVX2 (FMA3 for Intel and FMA4 for AMD) instruction sets.


I understand that, I just don't understand why that wouldn't be done anyway.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Walter Bright via Digitalmars-d

On 8/6/2016 3:02 AM, Iain Buclaw via Digitalmars-d wrote:

No pragmas tied to a specific architecture should be allowed in the
language spec, please.



A good point. On the other hand, a list of them would be nice so implementations 
don't step on each other.




Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Walter Bright via Digitalmars-d

On 8/6/2016 5:09 AM, Johannes Pfau wrote:

I think this restriction is also quite arbitrary.


You're right that there are gray areas, but the distinction is not arbitrary.

For example, mangling does not affect the interface. It affects the name.

Using an attribute has more downsides, as it affects the whole function rather 
than just part of it, as a pragma can.




Re: D safety! New Feature?

2016-08-06 Thread Chris Wright via Digitalmars-d
On Sat, 06 Aug 2016 07:56:29 +0200, ag0aep6g wrote:

> On 08/06/2016 03:38 AM, Chris Wright wrote:
>> Some reflection stuff is a bit inconvenient:
>>
>> class A {
>>   int foo() { return 1; }
>> }
>>
>> void main() {
>>   auto a = new immutable(A);
>>   // This passes:
>>   static assert(is(typeof(a.foo)));
>>   // This doesn't:
>>   static assert(__traits(compiles, () { a.foo; }));
>> }
>>
>> __traits(compiles) is mostly an evil hack, but things like this require
>> its use.
> 
> The two are not equivalent, though. The first one checks the type of the
> method.

Which is a bit awkward because in no other context is it possible to 
mention a raw method. You invoke it or you get a delegate to it.

And if you try to get a delegate explicitly to avoid this, you run into 
https://issues.dlang.org/show_bug.cgi?id=1983 .

It's also a bit awkward for a method to be reported to exist with no way 
to call it. While you can do the same with private methods, you get a 
warning:

a.d(7): Deprecation: b.B.priv is not visible from module a

Which implies that this will become illegal at some point and fail to 
compile.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Iain Buclaw via Digitalmars-d
On 6 August 2016 at 16:11, Patrick Schluter via Digitalmars-d
 wrote:
> On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
>>
>> On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d
>>  wrote:
>>>
>>> On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright wrote:


>>
>> No pragmas tied to a specific architecture should be allowed in the
>> language spec, please.
>
>
> Hmmm, that's the whole point of pragmas (at least in C): to specify
> implementation-specific stuff outside of the language spec. If it's in the
> language spec, it should be done with language-specific mechanisms.

https://dlang.org/spec/pragma.html#predefined-pragmas

"""
All implementations must support these, even if by just ignoring them.
...
Vendor specific pragma Identifiers can be defined if they are prefixed
by the vendor's trademarked name, in a similar manner to version
identifiers.
"""

So all added pragmas that have no vendor prefix must be treated as
part of the language in order to conform with the specs.
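LDC already follows the vendor-prefix convention; for example, 
ldc.intrinsics declares intrinsics along these lines (quoted from 
memory, so treat the exact signature as an approximation):

---
// The LDC_ prefix marks the pragma as vendor-specific; other
// compilers may ignore or reject it per their own policy.
pragma(LDC_intrinsic, "llvm.sqrt.f#")
T llvm_sqrt(T)(T val);
---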


Re: Recommended procedure to upgrade DMD installation

2016-08-06 Thread A D dev via Digitalmars-d

On Saturday, 6 August 2016 at 01:22:51 UTC, Mike Parker wrote:


DMD ships with the OPTLINK linker and uses it by default.



You generally don't need to worry about calling them directly


Got it, thanks.



Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Patrick Schluter via Digitalmars-d

On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d 
 wrote:
On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright 
wrote:




No pragmas tied to a specific architecture should be allowed in 
the language spec, please.


Hmmm, that's the whole point of pragmas (at least in C): to 
specify implementation-specific stuff outside of the language 
spec. If it's in the language spec, it should be done with 
language-specific mechanisms.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Iain Buclaw via Digitalmars-d
On 6 August 2016 at 13:30, Ilya Yaroshenko via Digitalmars-d
 wrote:
> On Saturday, 6 August 2016 at 11:10:18 UTC, Iain Buclaw wrote:
>>
>> On 6 August 2016 at 12:07, Ilya Yaroshenko via Digitalmars-d
>>  wrote:
>>>
>>> On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:


 On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d
  wrote:
>
>
> On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright wrote:
>>
>>
>> [...]
>
>
>
>
> OK, then we need a third pragma, `pragma(ieeeRound)`. But
> `pragma(fusedMath)` and `pragma(fastMath)` should be provided too.
>
>> [...]
>
>
>
>
> It allows a compiler to replace two arithmetic operations with a single
> composed one; see the AVX2 (FMA3 for Intel and FMA4 for AMD) instruction sets.



 No pragmas tied to a specific architecture should be allowed in the
 language spec, please.
>>>
>>>
>>>
>>> Then Mir will probably drop all compilers but LDC.
>>> LLVM is tied to the real world, so we can tie D to the real world too. If a
>>> compiler cannot implement an optimization pragma, then the pragma can just
>>> be ignored by the compiler.
>>
>>
>> If you need a function to work with an exclusive instruction set or
>> something as specific as use of composed/fused instructions, then it is
>> common to use an indirect function resolver to choose the most relevant
>> implementation for the system that's running the code (a la @ifunc), then
>> for the targeted fusedMath implementation, do it yourself.
>
>
> What do you mean by "do it yourself"? Write code using FMA GCC intrinsics?
> Why do I need to do something that can be automated by a compiler? The modern
> approach is to give a hint to the compiler instead of writing specialised code
> for different architectures.
>
> It seems you have misunderstood me. I don't want to force the compiler to use
> specific instruction sets. Instead, I want to give the compiler a hint
> about which math _transformations_ are allowed. And these hints are
> architecture-independent. A compiler may or may not use these hints to
> optimise code.

There are compiler switches for that.  Maybe there should be one
pragma to tweak these compiler switches on a per-function basis,
rather than separately named pragmas.  That way you tell the compiler
what you want, rather than it being part of the language logic to
understand what must be turned on/off internally.

First, assume the language knows nothing about what platform it's
running on, then use that as a basis for suggesting new pragmas that
should be supported everywhere.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Johannes Pfau via Digitalmars-d
On Sat, 6 Aug 2016 02:29:50 -0700,
Walter Bright wrote:

> On 8/6/2016 1:21 AM, Ilya Yaroshenko wrote:
> > On Friday, 5 August 2016 at 20:53:42 UTC, Walter Bright wrote:
> >  
> >> I agree that the typical summation algorithm suffers from double
> >> rounding. But that's one algorithm. I would appreciate if you
> >> would review
> >> http://dlang.org/phobos/std_algorithm_iteration.html#sum to ensure
> >> it doesn't have this problem, and if it does, how we can fix it. 
> >
> > Phobos's sum is two different algorithms. Pairwise summation for
> > Random Access Ranges and Kahan summation for Input Ranges. Pairwise
> > summation does not require IEEE rounding, but Kahan summation
> > requires it.
> >
> > The problem with a real-world example is that it depends on
> > optimisation. For example, if all temporary values are rounded,
> > this is not a problem, and if no temporary values are rounded
> > this is not a problem either. However, if some of them are rounded
> > and others are not, then this will break the Kahan algorithm.
> >
> > Kahan is the shortest and one of the slowest (compared with KBN,
> > for example) summation algorithms. The true story about Kahan is
> > that we may have it in Phobos, but we can use pairwise summation
> > for Input Ranges without random access, and it will be faster than
> > Kahan. So we don't need Kahan for the current API at all.
> >
> > Mir has both Kahan, which works with 32-bit DMD, and pairwise,
> > which works with input ranges.
> >
> > Kahan, KBN, KB2, and Precise summations always use `real` or
> > `Complex!real` internal values for the 32-bit x86 target. The only
> > problem with Precise summation is that if we need a precise result
> > in double and use real for internal summation, then the last bit
> > will be wrong in 50% of cases.
> >
> > Another good point about Mir's summation algorithms is that they
> > are Output Ranges. This means they can be used effectively to sum
> > multidimensional arrays, for example. Also, the Precise summator
> > may be used to compute the exact sum of distributed data.
> >
> > When we get a decision and a solution for the rounding problem, I
> > will make a PR for std.experimental.numeric.sum.
> >  
> >> I hear you. I'd like to explore ways of solving it. Got any
> >> ideas?  
> >
> > We need to look at the overall picture.
> >
> > It is very important to recognise that the D core team is small and
> > the D community is not large enough now to involve a lot of new
> > professionals. This means that the time of the existing engineers is
> > very important for D, and the most important engineer for D is you,
> > Walter.
> >
> > At the same time we need to move forward fast with language changes
> > and druntime changes (GC-less Fibers, for example).
> >
> > So, we need to choose tricky options for development. The most
> > important option for D in the science context is to split the D
> > Programming Language from DMD in our minds. I am not asking to
> > remove DMD as the reference compiler. Instead, we can introduce
> > changes in D that cannot be optimally implemented in DMD (because
> > you have a lot of more important things to do for D than
> > optimisation) but will be awesome for our LLVM-based and GCC-based
> > backends.
> >
> > We need 2 new pragmas with the same syntax as `pragma(inline, xxx)`:
> >
> > 1. `pragma(fusedMath)` allows fused mul-add, mul-sub, div-add,
> > div-sub operations. 2. `pragma(fastMath)` is equivalent to [1]. This
> > pragma can also be used to allow extended precision.
> >
> > These should be 2 separate pragmas. The second one may imply the
> > first one.
> >
> > The recent LDC beta has a @fastmath attribute for functions, and it
> > is already used in the Phobos ndslice.algorithm PR and its Mir
> > mirror. Attributes are an alternative to pragmas, but their syntax
> > should be extended; see [2].
> >
> > The old approach is separate compilation, but it is weird, low
> > level for users, and requires significant effort for both small
> > and large projects.
> >
> > [1] http://llvm.org/docs/LangRef.html#fast-math-flags
> > [2] https://github.com/ldc-developers/ldc/issues/1669  
> 
> Thanks for your help with this.
> 
> Using attributes for this is a mistake. Attributes affect the
> interface to a function

This is not true for UDAs. LDC and GDC actually implement @attribute
as a UDA. And UDAs used in serialization interfaces, the std.benchmark
proposals, ... do not affect the interface either.

> not its internal implementation.

It's possible to reflect on the UDAs of the current function, so this
is not true in general:
---
@(40) int foo()
{
    mixin("alias thisFunc = " ~ __FUNCTION__ ~ ";");
    return __traits(getAttributes, thisFunc)[0];
}
---
https://dpaste.dzfl.pl/aa0615b40adf

I think this restriction is also quite arbitrary. For end users,
attributes provide a much nicer syntax than pragmas. Both GDC and LDC
already successfully use UDAs for function-specific backend options, so
DMD is really the exception here.

Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Ilya Yaroshenko via Digitalmars-d

On Saturday, 6 August 2016 at 11:10:18 UTC, Iain Buclaw wrote:
On 6 August 2016 at 12:07, Ilya Yaroshenko via Digitalmars-d 
 wrote:

On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:


On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d 
 wrote:


On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright 
wrote:


[...]




OK, then we need a third pragma, `pragma(ieeeRound)`. But
`pragma(fusedMath)` and `pragma(fastMath)` should be provided too.


[...]




It allows a compiler to replace two arithmetic operations 
with a single composed one; see the AVX2 (FMA3 for Intel and 
FMA4 for AMD) instruction sets.



No pragmas tied to a specific architecture should be allowed 
in the language spec, please.



Then Mir will probably drop all compilers but LDC.
LLVM is tied to the real world, so we can tie D to the real world 
too. If a compiler cannot implement an optimization pragma, then 
the pragma can just be ignored by the compiler.


If you need a function to work with an exclusive instruction 
set or something as specific as use of composed/fused 
instructions, then it is common to use an indirect function 
resolver to choose the most relevant implementation for the 
system that's running the code (a la @ifunc), then for the 
targeted fusedMath implementation, do it yourself.


What do you mean by "do it yourself"? Write code using FMA GCC 
intrinsics? Why do I need to do something that can be automated 
by a compiler? The modern approach is to give a hint to the 
compiler instead of writing specialised code for different 
architectures.


It seems you have misunderstood me. I don't want to force the 
compiler to use specific instruction sets. Instead, I want to 
give the compiler a hint about which math _transformations_ are 
allowed. And these hints are architecture-independent. A compiler 
may or may not use these hints to optimise the code.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Iain Buclaw via Digitalmars-d
On 6 August 2016 at 12:07, Ilya Yaroshenko via Digitalmars-d
 wrote:
> On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
>>
>> On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d
>>  wrote:
>>>
>>> On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright wrote:

 [...]
>>>
>>>
>>>
>>> OK, then we need a third pragma, `pragma(ieeeRound)`. But
>>> `pragma(fusedMath)` and `pragma(fastMath)` should be provided too.
>>>
 [...]
>>>
>>>
>>>
>>> It allows a compiler to replace two arithmetic operations with a single
>>> composed one; see the AVX2 (FMA3 for Intel and FMA4 for AMD) instruction sets.
>>
>>
>> No pragmas tied to a specific architecture should be allowed in the
>> language spec, please.
>
>
> Then Mir will probably drop all compilers but LDC.
> LLVM is tied to the real world, so we can tie D to the real world too. If a
> compiler cannot implement an optimization pragma, then the pragma can just be
> ignored by the compiler.

If you need a function to work with an exclusive instruction set or
something as specific as use of composed/fused instructions, then it
is common to use an indirect function resolver to choose the most
relevant implementation for the system that's running the code (a la
@ifunc), then for the targeted fusedMath implementation, do it
yourself.
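A minimal sketch of such a resolver in plain D follows; detectFma() is a
hypothetical feature probe, standing in for a real CPUID query (e.g. via
core.cpuid):

---
import std.math : fma;

double muladdGeneric(double a, double b, double c) { return a * b + c; }
double muladdFused(double a, double b, double c) { return fma(a, b, c); }

// Hypothetical probe: substitute a real CPUID check here.
bool detectFma() { return false; }

// Resolved once at startup, much as @ifunc resolves at load time.
__gshared double function(double, double, double) muladd;

shared static this()
{
    muladd = detectFma() ? &muladdFused : &muladdGeneric;
}
---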


Re: D safety! New Feature?

2016-08-06 Thread Timon Gehr via Digitalmars-d

On 06.08.2016 07:56, ag0aep6g wrote:


Add parentheses to the typeof one and it fails as expected:

static assert(is(typeof(a.foo()))); /* fails */

Can also do the function literal thing you did in the __traits one:

static assert(is(typeof(() { a.foo; }))); /* fails */


You can, but don't.

class A {
    int foo() { return 1; }
}

void main() {
    auto a = new A;
    static void gotcha() {
        // This passes:
        static assert(is(typeof((){ a.foo; })));
        // This doesn't:
        static assert(__traits(compiles, (){ a.foo; }));
    }
}



Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Ilya Yaroshenko via Digitalmars-d

On Saturday, 6 August 2016 at 10:02:25 UTC, Iain Buclaw wrote:
On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d 
 wrote:
On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright 
wrote:

[...]



OK, then we need a third pragma, `pragma(ieeeRound)`. But 
`pragma(fusedMath)` and `pragma(fastMath)` should be provided too.


[...]



It allows a compiler to replace two arithmetic operations with 
a single composed one; see the AVX2 (FMA3 for Intel and FMA4 
for AMD) instruction sets.


No pragmas tied to a specific architecture should be allowed in 
the language spec, please.


Then Mir will probably drop all compilers but LDC.
LLVM is tied to the real world, so we can tie D to the real world 
too. If a compiler cannot implement an optimization pragma, then 
the pragma can just be ignored by the compiler.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Iain Buclaw via Digitalmars-d
On 6 August 2016 at 11:48, Ilya Yaroshenko via Digitalmars-d
 wrote:
> On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright wrote:
>>
>> On 8/6/2016 1:21 AM, Ilya Yaroshenko wrote:
>>>
>>> We need 2 new pragmas with the same syntax as `pragma(inline, xxx)`:
>>>
>>> 1. `pragma(fusedMath)` allows fused mul-add, mul-sub, div-add, div-sub
>>> operations.
>>> 2. `pragma(fastMath)` is equivalent to [1]. This pragma can be used to
>>> allow extended precision.
>>
>>
>>
>> The LDC fastmath bothers me a lot. It throws away proper NaN and infinity
>> handling, and throws away precision by allowing reciprocal and algebraic
>> transformations. As I've said before, correctness should be first, not
>> speed, and fastmath has nothing to do with this thread.
>
>
> OK, then we need a third pragma, `pragma(ieeeRound)`. But `pragma(fusedMath)`
> and `pragma(fastMath)` should be provided too.
>
>> I don't know what the point of fusedMath is.
>
>
> It allows a compiler to replace two arithmetic operations with a single
> composed one; see the AVX2 (FMA3 for Intel and FMA4 for AMD) instruction sets.

No pragmas tied to a specific architecture should be allowed in the
language spec, please.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Ilya Yaroshenko via Digitalmars-d

On Saturday, 6 August 2016 at 09:35:32 UTC, Walter Bright wrote:

On 8/6/2016 1:21 AM, Ilya Yaroshenko wrote:
We need 2 new pragmas with the same syntax as `pragma(inline, 
xxx)`:


1. `pragma(fusedMath)` allows fused mul-add, mul-sub, div-add, 
div-sub operations.
2. `pragma(fastMath)` is equivalent to [1]. This pragma can be 
used to allow extended precision.



The LDC fastmath bothers me a lot. It throws away proper NaN 
and infinity handling, and throws away precision by allowing 
reciprocal and algebraic transformations. As I've said before, 
correctness should be first, not speed, and fastmath has 
nothing to do with this thread.


OK, then we need a third pragma, `pragma(ieeeRound)`. But 
`pragma(fusedMath)` and `pragma(fastMath)` should be provided 
too.



I don't know what the point of fusedMath is.


It allows a compiler to replace two arithmetic operations with 
a single composed one; see the AVX2 (FMA3 for Intel and FMA4 
for AMD) instruction sets.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Walter Bright via Digitalmars-d

On 8/6/2016 1:21 AM, Ilya Yaroshenko wrote:

We need 2 new pragmas with the same syntax as `pragma(inline, xxx)`:

1. `pragma(fusedMath)` allows fused mul-add, mul-sub, div-add, div-sub 
operations.
2. `pragma(fastMath)` is equivalent to [1]. This pragma can be used to allow
extended precision.



The LDC fastmath bothers me a lot. It throws away proper NaN and infinity 
handling, and throws away precision by allowing reciprocal and algebraic 
transformations. As I've said before, correctness should be first, not speed, 
and fastmath has nothing to do with this thread.



I don't know what the point of fusedMath is.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Walter Bright via Digitalmars-d

On 8/6/2016 1:21 AM, Ilya Yaroshenko wrote:

On Friday, 5 August 2016 at 20:53:42 UTC, Walter Bright wrote:


I agree that the typical summation algorithm suffers from double rounding. But
that's one algorithm. I would appreciate if you would review
http://dlang.org/phobos/std_algorithm_iteration.html#sum to ensure it doesn't
have this problem, and if it does, how we can fix it.



Phobos's sum is two different algorithms. Pairwise summation for Random Access
Ranges and Kahan summation for Input Ranges. Pairwise summation does not require
IEEE rounding, but Kahan summation requires it.

The problem with a real-world example is that it depends on optimisation. For
example, if all temporary values are rounded, this is not a problem, and if no
temporary values are rounded this is not a problem either. However, if some of
them are rounded and others are not, then this will break the Kahan algorithm.
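For reference, a minimal Kahan summation sketch in D (not the Phobos
implementation) showing the compensation step that mixed rounding breaks:

---
double kahanSum(const(double)[] xs)
{
    double sum = 0.0, c = 0.0;  // c carries the lost low-order bits
    foreach (x; xs)
    {
        double y = x - c;
        double t = sum + y;
        // Each operation must round exactly once; if some temporaries
        // keep extra precision and others do not, c stops capturing
        // the true error.
        c = (t - sum) - y;
        sum = t;
    }
    return sum;
}
---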

Kahan is the shortest and one of the slowest (compared with KBN, for example)
summation algorithms. The true story about Kahan is that we may have it in Phobos,
but we can use pairwise summation for Input Ranges without random access, and it
will be faster than Kahan. So we don't need Kahan for the current API at all.

Mir has both Kahan, which works with 32-bit DMD, and pairwise, which works with
input ranges.

Kahan, KBN, KB2, and Precise summations always use `real` or `Complex!real`
internal values for the 32-bit x86 target. The only problem with Precise summation
is that if we need a precise result in double and use real for internal summation,
then the last bit will be wrong in 50% of cases.

Another good point about Mir's summation algorithms is that they are Output
Ranges. This means they can be used effectively to sum multidimensional arrays,
for example. Also, the Precise summator may be used to compute the exact sum of
distributed data.

When we get a decision and a solution for the rounding problem, I will make a PR
for std.experimental.numeric.sum.


I hear you. I'd like to explore ways of solving it. Got any ideas?


We need to look at the overall picture.

It is very important to recognise that the D core team is small and the D
community is not large enough now to involve a lot of new professionals. This
means that the time of the existing engineers is very important for D, and the
most important engineer for D is you, Walter.

At the same time we need to move forward fast with language changes and druntime
changes (GC-less Fibers, for example).

So, we need to choose tricky options for development. The most important option
for D in the science context is to split the D Programming Language from DMD in
our minds. I am not asking to remove DMD as the reference compiler. Instead, we
can introduce changes in D that cannot be optimally implemented in DMD (because
you have a lot of more important things to do for D than optimisation) but will
be awesome for our LLVM-based and GCC-based backends.

We need 2 new pragmas with the same syntax as `pragma(inline, xxx)`:

1. `pragma(fusedMath)` allows fused mul-add, mul-sub, div-add, div-sub 
operations.
2. `pragma(fastMath)` is equivalent to [1]. This pragma can also be used to
allow extended precision.

These should be 2 separate pragmas. The second one may imply the first.

The recent LDC beta has a @fastmath attribute for functions, and it is already
used in the Phobos ndslice.algorithm PR and its Mir mirror. Attributes are an
alternative to pragmas, but their syntax should be extended; see [2].

The old approach is separate compilation, but it is weird, low-level for users,
and requires significant effort for both small and large projects.

[1] http://llvm.org/docs/LangRef.html#fast-math-flags
[2] https://github.com/ldc-developers/ldc/issues/1669


Thanks for your help with this.

Using attributes for this is a mistake. Attributes affect the interface to a 
function, not its internal implementation. Pragmas are suitable for internal 
implementation things. I also oppose using compiler flags, because they tend to 
be overly global, and the details of an algorithm should not be split between 
the source code and the makefile.




Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Iain Buclaw via Digitalmars-d
On 4 August 2016 at 23:38, Seb via Digitalmars-d
 wrote:
> On Thursday, 4 August 2016 at 21:13:23 UTC, Iain Buclaw wrote:
>>
>> On 4 August 2016 at 01:00, Seb via Digitalmars-d
>>  wrote:
>>>
>>> To make matters worse, std.math yields different results than
>>> compiler/assembly intrinsics - note that in this example import std.math.pow
>>> adds about 1K instructions to the output assembler, whereas llvm_powf boils
>>> down to the assembly powf. Of course the performance of powf is a lot
>>> better; I measured [3] that e.g. std.math.pow takes ~1.5x as long for both
>>> LDC and DMD. Of course if you need to run this very often, this cost isn't
>>> acceptable.
>>>
>>
>> This could be something specific to your architecture.  I get the same
>> result from all versions of powf, and from GCC builtins too, regardless
>> of optimization tunings.
>
>
> I can reproduce this on DPaste (also x86_64).
>
> https://dpaste.dzfl.pl/c0ab5131b49d
>
> Behavior with a recent LDC build is similar (as annotated with the
> comments).

When testing the math functions, I chose not to compare results to
what C libraries, or CPU instructions, return, but rather compared the
results to Wolfram, which I hope I'm correct in saying is a more
reliable and proven source of scientific maths than libm.

As of the time I ported all pure D (not IASM) implementations of math
functions, the results returned from all unittests using 80-bit reals
were identical to Wolfram's, allowing up to the last 2 digits as an
acceptable error for some values.  This was true for all except
inputs that were just inside the domain of the function, in which
case only double precision was guaranteed.  Where applicable, they
were also found in some cases to be more accurate than the inline
assembler or yl2x implementation version paths that are used if you
compile with DMD or LDC.


Re: Why don't we switch to C like floating pointed arithmetic instead of automatic expansion to reals?

2016-08-06 Thread Ilya Yaroshenko via Digitalmars-d

On Friday, 5 August 2016 at 20:53:42 UTC, Walter Bright wrote:

I agree that the typical summation algorithm suffers from 
double rounding. But that's one algorithm. I would appreciate 
if you would review 
http://dlang.org/phobos/std_algorithm_iteration.html#sum to 
ensure it doesn't have this problem, and if it does, how we can 
fix it.




Phobos's sum is two different algorithms. Pairwise summation for 
Random Access Ranges and Kahan summation for Input Ranges. 
Pairwise summation does not require IEEE rounding, but Kahan 
summation requires it.
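For contrast, a minimal pairwise summation sketch in D (an illustration 
of the scheme, not the Phobos code; it needs no particular rounding 
behaviour):

---
double pairwiseSum(const(double)[] xs)
{
    if (xs.length <= 2)
    {
        double s = 0.0;
        foreach (x; xs)
            s += x;
        return s;
    }
    // Summing the halves independently keeps the error growth at
    // O(log n) instead of the O(n) of naive left-to-right summation.
    const mid = xs.length / 2;
    return pairwiseSum(xs[0 .. mid]) + pairwiseSum(xs[mid .. $]);
}
---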


The problem with a real-world example is that it depends on 
optimisation. For example, if all temporary values are rounded, 
this is not a problem, and if no temporary values are rounded 
this is not a problem either. However, if some of them are 
rounded and others are not, then this will break the Kahan 
algorithm.


Kahan is the shortest and one of the slowest (compared with KBN, 
for example) summation algorithms. The true story about Kahan is 
that we may have it in Phobos, but we can use pairwise summation 
for Input Ranges without random access, and it will be faster 
than Kahan. So we don't need Kahan for the current API at all.


Mir has both Kahan, which works with 32-bit DMD, and pairwise, 
which works with input ranges.


Kahan, KBN, KB2, and Precise summations always use `real` or 
`Complex!real` internal values for the 32-bit x86 target. The 
only problem with Precise summation is that if we need a precise 
result in double and use real for internal summation, then the 
last bit will be wrong in 50% of cases.


Another good point about Mir's summation algorithms is that they 
are Output Ranges. This means they can be used effectively to sum 
multidimensional arrays, for example. Also, the Precise summator 
may be used to compute the exact sum of distributed data.


When we get a decision and a solution for the rounding problem, I 
will make a PR for std.experimental.numeric.sum.


I hear you. I'd like to explore ways of solving it. Got any 
ideas?


We need to look at the overall picture.

It is very important to recognise that the D core team is small 
and the D community is not large enough now to involve a lot of 
new professionals. This means that the time of the existing 
engineers is very important for D, and the most important 
engineer for D is you, Walter.


At the same time we need to move forward fast with language 
changes and druntime changes (GC-less Fibers, for example).


So, we need to choose tricky options for development. The most 
important option for D in the science context is to split the D 
Programming Language from DMD in our minds. I am not asking to 
remove DMD as the reference compiler. Instead, we can introduce 
changes in D that cannot be optimally implemented in DMD (because 
you have a lot of more important things to do for D than 
optimisation) but will be awesome for our LLVM-based and 
GCC-based backends.


We need 2 new pragmas with the same syntax as `pragma(inline, 
xxx)`:


1. `pragma(fusedMath)` allows fused mul-add, mul-sub, div-add, 
div-sub operations.
2. `pragma(fastMath)` is equivalent to [1]. This pragma can also 
be used to allow extended precision.


These should be 2 separate pragmas. The second one may imply the 
first.
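A hypothetical sketch of how the proposed pragmas would read, mirroring 
the `pragma(inline, xxx)` placement (none of this is implemented):

---
double dot(const(double)[] a, const(double)[] b)
{
    pragma(fusedMath);  // proposed: allow contraction in this function only
    double s = 0.0;
    foreach (i; 0 .. a.length)
        s += a[i] * b[i];  // the compiler would be free to emit FMA here
    return s;
}
---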


The recent LDC beta has a @fastmath attribute for functions, and 
it is already used in the Phobos ndslice.algorithm PR and its Mir 
mirror. Attributes are an alternative to pragmas, but their 
syntax should be extended; see [2].


The old approach is separate compilation, but it is weird, 
low-level for users, and requires significant effort for both 
small and large projects.


[1] http://llvm.org/docs/LangRef.html#fast-math-flags
[2] https://github.com/ldc-developers/ldc/issues/1669

Best regards,
Ilya



Re: For the Love of God, Please Write Better Docs!

2016-08-06 Thread poliklosio via Digitalmars-d

On Friday, 5 August 2016 at 21:01:28 UTC, H.Loom wrote:

On Friday, 5 August 2016 at 19:52:19 UTC, poliklosio wrote:

On Tuesday, 2 August 2016 at 20:26:06 UTC, Jack Stouffer wrote:

(...)

(...)
In my opinion open source community massively underestimates 
the importance of high-level examples, articles and tutorials. 
(...)


This is not true for widely used libraries. I can find examples 
everywhere of how to use FreeType, even though the bindings I use 
have 0 docs and 0 examples. The same goes for libX11... Also, I 
have to say that with a well-crafted library you understand what 
a function does when it and its parameters are well named.


This is another story for marginal libraries, e.g. when you're 
among the early adopters.


I think we agree here. Most libraries are marginal, and the lack 
of a proper announcement and documentation is the main reason 
they are marginal. Hence, this is true for most libraries.
Of course there are good ones; sadly, not many D libraries are 
really well documented.


If you are a library author and you are reading this, let me 
quantify this for you.


Thank you! You're so generous.


Why the sarcasm? I was just venting (at no-one in particular) 
after hitting the wall for a large chunk of my life.



(...)

This also explains part of the complaints about the Phobos 
documentation - people don't get the general idea of how to make 
things work together.


For Phobos I agree. The D examples shipped with DMD are 
ridiculous. I was thinking of proposing an initiative to renew 
them completely with small, usable, and idiomatic programs.


Those who do get the general idea don't care much about the 
exact width of whitespace and other similar concerns.


I don't understand your point here. Are you talking about text 
justification in ddoc? If it's a mono font there's no problem 
here...


I'm lost here. The "width of whitespace" was just an example of 
something you would NOT normally care about if you were a savvy 
user who already knows how to navigate the docs.