Any usable SIMD implementation?

2016-03-31 Thread Martin Nowak via Digitalmars-d
I'm currently working on a templated arrayop implementation (using RPN to encode ASTs). So far things worked out great, but now I got stuck b/c apparently none of the D compilers has a working SIMD implementation (maybe GDC has but it's very difficult to work w/ the 2.066 frontend). https://github

Re: Any usable SIMD implementation?

2016-03-31 Thread ZombineDev via Digitalmars-d
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote: I'm currently working on a templated arrayop implementation (using RPN to encode ASTs). So far things worked out great, but now I got stuck b/c apparently none of the D compilers has a working SIMD implementation (maybe GDC has bu

Re: Any usable SIMD implementation?

2016-03-31 Thread John Colvin via Digitalmars-d
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote: I'm currently working on a templated arrayop implementation (using RPN to encode ASTs). So far things worked out great, but now I got stuck b/c apparently none of the D compilers has a working SIMD implementation (maybe GDC has bu

Re: Any usable SIMD implementation?

2016-03-31 Thread Johan Engelen via Digitalmars-d
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote: I don't want to do anything fancy, just unaligned loads, stores, and integral mul/div. Is this really the current state of SIMD or am I missing sth.? I think you want to write your code using SIMD primitives. But in case you wan

Re: Any usable SIMD implementation?

2016-03-31 Thread Iakh via Digitalmars-d
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote: I'm currently working on a templated arrayop implementation (using RPN to encode ASTs). So far things worked out great, but now I got stuck b/c apparently none of the D compilers has a working SIMD implementation (maybe GDC has bu

Re: Any usable SIMD implementation?

2016-04-01 Thread Martin Nowak via Digitalmars-d
On 03/31/2016 10:55 AM, ZombineDev wrote: > [2]: https://github.com/D-Programming-Language/phobos/pull/2862 Well apparently stores w/ dmd's weird core.simd interface don't work, or I can't figure out (from the non-existent documentation) how to use it. --- import core.simd; void test(float4* ptr

Re: Any usable SIMD implementation?

2016-04-01 Thread Iain Buclaw via Digitalmars-d
On 2 Apr 2016 12:40 am, "Martin Nowak via Digitalmars-d" < digitalmars-d@puremagic.com> wrote: > > On 03/31/2016 10:55 AM, ZombineDev wrote: > > [2]: https://github.com/D-Programming-Language/phobos/pull/2862 > > Well apparently stores w/ dmd's weird core.simd interface don't work, or > I can't fig

Re: Any usable SIMD implementation?

2016-04-02 Thread Martin Nowak via Digitalmars-d
On Saturday, 2 April 2016 at 06:13:24 UTC, Iain Buclaw wrote: I would just let the compiler optimize / vectorize the operation, but then again that it is probably just me who thinks these things. It's intended to replace the array ops in druntime, relying on vectorizers won't suffice, e.g. you

Re: Any usable SIMD implementation?

2016-04-02 Thread Iain Buclaw via Digitalmars-d
On 2 Apr 2016 9:45 am, "Martin Nowak via Digitalmars-d" < digitalmars-d@puremagic.com> wrote: > > On Saturday, 2 April 2016 at 06:13:24 UTC, Iain Buclaw wrote: >> >> I would just let the compiler optimize / vectorize the operation, but then again that it is probably just me who thinks these things.

Re: Any usable SIMD implementation?

2016-04-02 Thread Martin Nowak via Digitalmars-d
On 04/02/2016 10:19 AM, Iain Buclaw via Digitalmars-d wrote: >> > __builtin_ia32_loadups >> > __builtin_ia32_storeups > Any agnostic way to... :-) I'm already using vector types for most operations, so it's somewhat portable. But for whatever reason D doesn't allow multiplication/division w/ integ

Re: Any usable SIMD implementation?

2016-04-02 Thread 9il via Digitalmars-d
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote: I'm currently working on a templated arrayop implementation (using RPN to encode ASTs). So far things worked out great, but now I got stuck b/c apparently none of the D compilers has a working SIMD implementation (maybe GDC has bu

Re: Any usable SIMD implementation?

2016-04-02 Thread Iain Buclaw via Digitalmars-d
On 3 Apr 2016 8:15 am, "9il via Digitalmars-d" wrote: > > On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote: >> >> I'm currently working on a templated arrayop implementation (using RPN >> to encode ASTs). >> So far things worked out great, but now I got stuck b/c apparently none >> o

Re: Any usable SIMD implementation?

2016-04-03 Thread Manu via Digitalmars-d
On 3 April 2016 at 16:14, 9il via Digitalmars-d wrote: > On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote: >> >> I'm currently working on a templated arrayop implementation (using RPN >> to encode ASTs). >> So far things worked out great, but now I got stuck b/c apparently none >> of

Re: Any usable SIMD implementation?

2016-04-03 Thread Johan Engelen via Digitalmars-d
On Friday, 1 April 2016 at 22:31:00 UTC, Martin Nowak wrote: LDC at least has some intrinsics once you find ldc.gccbuiltins_x86, but for some reason comes with its own broken ldc.simd.loadUnaligned. Please submit a GH issue with LDC, thanks! -Johan

Re: Any usable SIMD implementation?

2016-04-03 Thread Martin Nowak via Digitalmars-d
On Friday, 1 April 2016 at 22:31:00 UTC, Martin Nowak wrote: Well apparently stores w/ dmd's weird core.simd interface don't work, or I can't figure out (from the non-existent documentation) how to use it. https://github.com/D-Programming-Language/dmd/pull/5625

Re: Any usable SIMD implementation?

2016-04-03 Thread Walter Bright via Digitalmars-d
On 4/3/2016 12:39 AM, Manu via Digitalmars-d wrote: My SIMD implementation has been blocked on that for years too. First I've heard of that. I need to know the SIMD level flags passed to the compiler at least, and DMD needs to introduce the concept. Here is a list of all the open Bugzilla

Re: Any usable SIMD implementation?

2016-04-03 Thread Jack Stouffer via Digitalmars-d
On Sunday, 3 April 2016 at 22:00:51 UTC, Walter Bright wrote: I need to know the SIMD level flags passed to the compiler at least, and DMD needs to introduce the concept. Here is a list of all the open Bugzilla issues tagged with the keyword SIMD: https://issues.dlang.org/buglist.cgi?bug_st

Re: Any usable SIMD implementation?

2016-04-03 Thread Walter Bright via Digitalmars-d
On 4/3/2016 7:12 PM, Jack Stouffer wrote: On Sunday, 3 April 2016 at 22:00:51 UTC, Walter Bright wrote: There is no issue I can find about being blocked for years on SIMD flags. I guarantee you that if you never report the problems you're having, you will suffer in silence and they will not get

Re: Any usable SIMD implementation?

2016-04-04 Thread Marco Leise via Digitalmars-d
On Sun, 03 Apr 2016 06:14:23, 9il wrote: > Hello Martin, > > Is it possible to introduce compile time information about target > platform? I am working on BLAS from scratch implementation. And > it is no hope to create something useable without CT information > about target. > > Best

Re: Any usable SIMD implementation?

2016-04-04 Thread 9il via Digitalmars-d
On Sunday, 3 April 2016 at 06:33:13 UTC, Iain Buclaw wrote: On 3 Apr 2016 8:15 am, "9il via Digitalmars-d" wrote: Hello Martin, Is it possible to introduce compile time information about target platform? I am working on BLAS from scratch implementation. And it is no hope to create somethin

Re: Any usable SIMD implementation?

2016-04-04 Thread Marco Leise via Digitalmars-d
On Mon, 04 Apr 2016 14:02:03, 9il wrote: > Target cpu configuration: > - CPU architecture (done) > - Count of FP/Integer registers > - Allowed sets of instructions: for example, AVX2, FMA4 > - Compiler optimization options (for math) > > Ilya - On amd64, whether floating-point math is ha

Re: Any usable SIMD implementation?

2016-04-04 Thread Jack Stouffer via Digitalmars-d
On Sunday, 3 April 2016 at 07:39:00 UTC, Manu wrote: My SIMD implementation has been blocked on that for years too. I need to know the SIMD level flags passed to the compiler at least, and DMD needs to introduce the concept. I made a bug to track this problem: https://issues.dlang.org/show_b

Re: Any usable SIMD implementation?

2016-04-04 Thread jmh530 via Digitalmars-d
On Monday, 4 April 2016 at 17:23:49 UTC, Jack Stouffer wrote: I made a bug to track this problem: https://issues.dlang.org/show_bug.cgi?id=15873 You might add link to this thread and github where he made the original comment..

Re: Any usable SIMD implementation?

2016-04-04 Thread 9il via Digitalmars-d
On Monday, 4 April 2016 at 16:21:15 UTC, Marco Leise wrote: On Mon, 04 Apr 2016 14:02:03, 9il wrote: - On amd64, whether floating-point math is handled by the FPU or SSE. When emulating floating-point, e.g. for float-to-string and string-to-float code, it is useful to know where to

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 10:27 AM, jmh530 wrote: On Monday, 4 April 2016 at 17:23:49 UTC, Jack Stouffer wrote: I made a bug to track this problem: https://issues.dlang.org/show_bug.cgi?id=15873 You might add link to this thread and github where he made the original comment.. http://www.digitalmars.com/

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 9:21 AM, Marco Leise wrote: To put this to good use, we need a reliable way - basically a global variable - to check for SSE4 (or POPCNT, etc.). What we have now does not work across all compilers. http://dlang.org/phobos/core_cpuid.html
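
For readers following the link, here is a minimal sketch of what a run-time check through core.cpuid can look like. The property names (`sse2`, `sse41`, `sse42`, `avx`, `avx2`, `hasPopcnt`) are the ones documented for that module; the helper `widestSimdLevel` and its string labels are made up for this illustration and are not part of druntime. This only answers the run-time question, not the compile-time one debated in the rest of the thread, and on non-x86 targets the flags would all be expected to read as false.

```d
import core.cpuid : avx, avx2, hasPopcnt, sse2, sse41, sse42;
import std.stdio : writefln;

/// Hypothetical helper: name of the widest SIMD level the running CPU reports.
/// The labels are this sketch's own, not identifiers from druntime.
string widestSimdLevel()
{
    if (avx2)  return "AVX2";
    if (avx)   return "AVX";
    if (sse42) return "SSE4.2";
    if (sse41) return "SSE4.1";
    if (sse2)  return "SSE2";
    return "none";
}

void main()
{
    // Values are read from CPUID at program start-up by core.cpuid.
    writefln("widest SIMD level: %s, POPCNT: %s", widestSimdLevel(), hasPopcnt);
}
```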

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 10:23 AM, Jack Stouffer wrote: On Sunday, 3 April 2016 at 07:39:00 UTC, Manu wrote: My SIMD implementation has been blocked on that for years too. I need to know the SIMD level flags passed to the compiler at least, and DMD needs to introduce the concept. I made a bug to track this

Re: Any usable SIMD implementation?

2016-04-04 Thread ZombineDev via Digitalmars-d
On Monday, 4 April 2016 at 19:43:43 UTC, Walter Bright wrote: On 4/4/2016 10:23 AM, Jack Stouffer wrote: On Sunday, 3 April 2016 at 07:39:00 UTC, Manu wrote: My SIMD implementation has been blocked on that for years too. I need to know the SIMD level flags passed to the compiler at least, and

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 12:55 PM, ZombineDev wrote: I believe the issue is fixed (for DMD) with a documentation improvement. I believe the problem is that you can't rely on D_SIMD that SSE4, FMA, AVX2, AVX-512, etc. are available on the target platform. See also http://forum.dlang.org/post/fnrmgfvqmykttsuux

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 7:02 AM, 9il wrote: What kind of information? Target cpu configuration: - CPU architecture (done) Done. - Count of FP/Integer registers ?? - Allowed sets of instructions: for example, AVX2, FMA4 Done. D_SIMD - Compiler optimization options (for math) Moot. DMD does not

Re: Any usable SIMD implementation?

2016-04-04 Thread 9il via Digitalmars-d
On Monday, 4 April 2016 at 20:29:11 UTC, Walter Bright wrote: On 4/4/2016 7:02 AM, 9il wrote: What kind of information? Target cpu configuration: - CPU architecture (done) Done. - Count of FP/Integer registers ?? How many general purpose registers, SIMD Floating Point registers, SIMD

Re: Any usable SIMD implementation?

2016-04-04 Thread jmh530 via Digitalmars-d
On Monday, 4 April 2016 at 20:29:11 UTC, Walter Bright wrote: - Allowed sets of instructions: for example, AVX2, FMA4 Done. D_SIMD I'm not a SIMD expert, I've only played around with SIMD a little, but this confuses me. version(D_SIMD) will tell you when SIMD is implemented, but not what

Re: Any usable SIMD implementation?

2016-04-04 Thread jmh530 via Digitalmars-d
On Monday, 4 April 2016 at 21:05:44 UTC, 9il wrote: OpenBLAS kernels is 30 MB of assembler code! So we would be able to replace it once and for a very long time with Phobos. Are you familiar with this project at all? https://github.com/flame/blis

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 2:11 PM, jmh530 wrote: version(D_SIMD) will tell you when SIMD is implemented, but not what type of SIMD. The first SIMD level. For instance, if I am on a machine that can use AVX2 instructions, then code in a version(D_SIMD) block will execute, but it should also execute if the p

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 2:05 PM, 9il wrote: - Count of FP/Integer registers ?? How many general purpose registers, SIMD Floating Point registers, SIMD Integer registers have a CPU? These are deducible from X86, X86_64, and SIMD version identifiers. Needs to know is it AVX or AVX2 in compile time Sin

Re: Any usable SIMD implementation?

2016-04-04 Thread 9il via Digitalmars-d
On Monday, 4 April 2016 at 22:34:06 UTC, Walter Bright wrote: On 4/4/2016 2:05 PM, 9il wrote: - Count of FP/Integer registers ?? How many general purpose registers, SIMD Floating Point registers, SIMD Integer registers have a CPU? These are deducible from X86, X86_64, and SIMD version iden

Re: Any usable SIMD implementation?

2016-04-04 Thread 9il via Digitalmars-d
On Monday, 4 April 2016 at 21:13:30 UTC, jmh530 wrote: On Monday, 4 April 2016 at 21:05:44 UTC, 9il wrote: OpenBLAS kernels is 30 MB of assembler code! So we would be able to replace it once and for a very long time with Phobos. Are you familiar with this project at all? https://github.com/f

Re: Any usable SIMD implementation?

2016-04-05 Thread Walter Bright via Digitalmars-d
On 4/4/2016 11:10 PM, 9il wrote: It is impossible to deduct from that combination that Xeon Phi has 32 FP registers. Since dmd doesn't generate specific code for a Xeon Phi, having a compile time switch for it is meaningless. "Since the compiler never generates AVX or AVX2" - this is defi

Re: Any usable SIMD implementation?

2016-04-05 Thread John Colvin via Digitalmars-d
On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: On 4/4/2016 11:10 PM, 9il wrote: It is impossible to deduct from that combination that Xeon Phi has 32 FP registers. Since dmd doesn't generate specific code for a Xeon Phi, having a compile time switch for it is meaningless. "

Re: Any usable SIMD implementation?

2016-04-05 Thread 9il via Digitalmars-d
On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: On 4/4/2016 11:10 PM, 9il wrote: I still don't understand why you cannot just set '-version=xxx' on the command line and then switch off that version in your custom code. I can do it, however I would like to get this information f
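
The workaround discussed here (user-supplied version identifiers) can be sketched roughly as follows. `MyLib_AVX2` and `MyLib_SSE41` are hypothetical names invented for this example; no compiler defines them, which is exactly the objection raised in the thread. Only `D_SIMD` is a predefined identifier. The sketch uses plain array operations rather than real vector intrinsics, so the identifiers only select a block width.

```d
import std.stdio : writeln;

// Assumption: the user passes one of these made-up identifiers by hand, e.g.
//   dmd -version=MyLib_AVX2 app.d
//   gdc -fversion=MyLib_AVX2 app.d
//   ldc2 -d-version=MyLib_AVX2 app.d
version (MyLib_AVX2)
    enum size_t floatsPerVector = 8;
else version (MyLib_SSE41)
    enum size_t floatsPerVector = 4;
else version (D_SIMD)
    enum size_t floatsPerVector = 4; // baseline SIMD only, no finer detail
else
    enum size_t floatsPerVector = 1;

/// Multiplies every element by `factor`, processing `floatsPerVector`
/// elements per step and finishing the remainder one element at a time.
float[] scale(const(float)[] input, float factor)
{
    auto result = new float[input.length];
    size_t i = 0;
    for (; i + floatsPerVector <= input.length; i += floatsPerVector)
        result[i .. i + floatsPerVector] = input[i .. i + floatsPerVector] * factor;
    for (; i < input.length; ++i)
        result[i] = input[i] * factor;
    return result;
}

void main()
{
    writeln("block width: ", floatsPerVector);
    writeln(scale([1.0f, 2.0f, 3.0f, 4.0f, 5.0f], 2.0f));
}
```

The weakness is visible in the comment block: nothing ties `-version=MyLib_AVX2` to the code generation flags actually in effect, so the two can silently disagree.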

Re: Any usable SIMD implementation?

2016-04-05 Thread Walter Bright via Digitalmars-d
On 4/5/2016 2:03 AM, John Colvin wrote: So you're suggesting that libraries invent their own list of versions for specific architectures / CPU features, which the user then has to specify somehow on the command line? I want to be able to write code that uses standardised versions that work across

Re: Any usable SIMD implementation?

2016-04-05 Thread Walter Bright via Digitalmars-d
On 4/5/2016 2:39 AM, 9il wrote: On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: On 4/4/2016 11:10 PM, 9il wrote: I still don't understand why you cannot just set '-version=xxx' on the command line and then switch off that version in your custom code. I can do it, however I would

Re: Any usable SIMD implementation?

2016-04-05 Thread Johan Engelen via Digitalmars-d
On Tuesday, 5 April 2016 at 09:39:21 UTC, 9il wrote: 3. This is possible and not very hard to implement if I am not wrong. Last time I looked into this (related to implementing @target, see [1]), I only found some Clang code dealing with this, but now I found LLVM functions about architectu

Re: Any usable SIMD implementation?

2016-04-05 Thread 9il via Digitalmars-d
On Tuesday, 5 April 2016 at 10:30:19 UTC, Walter Bright wrote: On 4/5/2016 2:39 AM, 9il wrote: On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: 1. This would help to eliminate configuration bugs. 2. This would reduce work for users and simplified user experience. 3. This is possib

Re: Any usable SIMD implementation?

2016-04-05 Thread 9il via Digitalmars-d
On Tuesday, 5 April 2016 at 10:27:46 UTC, Walter Bright wrote: On 4/5/2016 2:03 AM, John Colvin wrote: There's a line between trying to standardize everything and letting add-on libraries be free to innovate. Besides, I think it's a poor design to customize the app for only one SIMD type. A b

Re: Any usable SIMD implementation?

2016-04-05 Thread Walter Bright via Digitalmars-d
On 4/5/2016 4:07 AM, 9il wrote: On Tuesday, 5 April 2016 at 10:30:19 UTC, Walter Bright wrote: On 4/5/2016 2:39 AM, 9il wrote: On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: 1. This would help to eliminate configuration bugs. 2. This would reduce work for users and simplified us

Re: Any usable SIMD implementation?

2016-04-05 Thread Johan Engelen via Digitalmars-d
On Tuesday, 5 April 2016 at 21:29:41 UTC, Walter Bright wrote: I want to make it clear that dmd does not generate AFX specific code, has no switch to enable AFX code generation and has no basis for setting predefined version identifiers for it. How about adding a "__target(...)" compile-time

Re: Any usable SIMD implementation?

2016-04-05 Thread Walter Bright via Digitalmars-d
On 4/5/2016 4:17 AM, 9il wrote: What wrong for scientist to write `-mcpu=native`? Because it would affect all the code in the module and every template it imports, which is a problem if you are using 'static if' and want to compile different pieces with different settings.

Re: Any usable SIMD implementation?

2016-04-05 Thread 9il via Digitalmars-d
On Wednesday, 6 April 2016 at 00:45:54 UTC, Walter Bright wrote: On 4/5/2016 4:17 AM, 9il wrote: What wrong for scientist to write `-mcpu=native`? Because it would affect all the code in the module and every template it imports, which is a problem if you are using 'static if' and want to com

Re: Any usable SIMD implementation?

2016-04-05 Thread 9il via Digitalmars-d
On Tuesday, 5 April 2016 at 21:29:41 UTC, Walter Bright wrote: On 4/5/2016 4:07 AM, 9il wrote: On Tuesday, 5 April 2016 at 10:30:19 UTC, Walter Bright wrote: On 4/5/2016 2:39 AM, 9il wrote: On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: 1. This would help to eliminate configur

Re: Any usable SIMD implementation?

2016-04-05 Thread 9il via Digitalmars-d
On Tuesday, 5 April 2016 at 21:41:46 UTC, Johan Engelen wrote: On Tuesday, 5 April 2016 at 21:29:41 UTC, Walter Bright wrote: I want to make it clear that dmd does not generate AFX specific code, has no switch to enable AFX code generation and has no basis for setting predefined version ident

Re: Any usable SIMD implementation?

2016-04-06 Thread jmh530 via Digitalmars-d
On Wednesday, 6 April 2016 at 06:11:15 UTC, 9il wrote: Yes, only few of us would use this feature directly, however, many of us would use this under-the-hood in BLAS/SIMD oriented part of Phobos. Especially since everyone says to use LDC for the fastest code anyway...

Re: Any usable SIMD implementation?

2016-04-06 Thread Manu via Digitalmars-d
On 5 April 2016 at 20:30, Walter Bright via Digitalmars-d wrote: > On 4/5/2016 2:39 AM, 9il wrote: >> >> On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: >>> >>> On 4/4/2016 11:10 PM, 9il wrote: >>> I still don't understand why you cannot just set '-version=xxx' on the >>> command >>

Re: Any usable SIMD implementation?

2016-04-06 Thread Manu via Digitalmars-d
On 6 April 2016 at 07:41, Johan Engelen via Digitalmars-d wrote: > On Tuesday, 5 April 2016 at 21:29:41 UTC, Walter Bright wrote: >> >> >> I want to make it clear that dmd does not generate AFX specific code, has >> no switch to enable AFX code generation and has no basis for setting >> predefined

Re: Any usable SIMD implementation?

2016-04-06 Thread 9il via Digitalmars-d
On Wednesday, 6 April 2016 at 12:40:04 UTC, Manu wrote: On 6 April 2016 at 07:41, Johan Engelen via Digitalmars-d wrote: [...] With respect to SIMD, knowing a processor model like 'broadwell' is not helpful, since we really want to know 'sse4'. If we know processor model, then we need to ke

Re: Any usable SIMD implementation?

2016-04-06 Thread Johan Engelen via Digitalmars-d
On Wednesday, 6 April 2016 at 13:26:51 UTC, 9il wrote: On Wednesday, 6 April 2016 at 12:40:04 UTC, Manu wrote: On 6 April 2016 at 07:41, Johan Engelen via Digitalmars-d wrote: [...] With respect to SIMD, knowing a processor model like 'broadwell' is not helpful, since we really want to know

Re: Any usable SIMD implementation?

2016-04-06 Thread 9il via Digitalmars-d
On Wednesday, 6 April 2016 at 14:31:58 UTC, Johan Engelen wrote: Probably the most difficult part is defining an API. Ilya made a start here: http://forum.dlang.org/post/eodutgruoofruperr...@forum.dlang.org (but he doesn't like his earlier API "bool a = __target("broadwell")" any more ;-P , I a

Re: Any usable SIMD implementation?

2016-04-06 Thread Walter Bright via Digitalmars-d
On 4/6/2016 5:36 AM, Manu via Digitalmars-d wrote: But at very least, the important detail is that the version ID's are standardised and shared among all compilers. It's a reasonable suggestion; some points: 1. This has been characterized as a blocker, it is not, as it does not impede writing

Re: Any usable SIMD implementation?

2016-04-06 Thread Manu via Digitalmars-d
On 6 April 2016 at 23:26, 9il via Digitalmars-d wrote: > On Wednesday, 6 April 2016 at 12:40:04 UTC, Manu wrote: >> >> On 6 April 2016 at 07:41, Johan Engelen via Digitalmars-d >> wrote: >>> >>> [...] >> >> >> With respect to SIMD, knowing a processor model like 'broadwell' is not >> helpful, sin

Re: Any usable SIMD implementation?

2016-04-06 Thread Manu via Digitalmars-d
On 7 April 2016 at 10:42, Walter Bright via Digitalmars-d wrote: > On 4/6/2016 5:36 AM, Manu via Digitalmars-d wrote: >> >> But at very least, the important detail is that the version ID's are >> standardised and shared among all compilers. > > > It's a reasonable suggestion; some points: > > 1. T

Re: Any usable SIMD implementation?

2016-04-06 Thread Walter Bright via Digitalmars-d
On 4/6/2016 7:43 PM, Manu via Digitalmars-d wrote: 1. This has been characterized as a blocker, it is not, as it does not impede writing code that takes advantage of various SIMD code generation at compile time. It's sufficiently blocking that I have not felt like working any further without th

Re: Any usable SIMD implementation?

2016-04-06 Thread Walter Bright via Digitalmars-d
On 4/6/2016 7:25 PM, Manu via Digitalmars-d wrote: Sure, but it's an ongoing maintenance task, constantly requiring population with metadata for new processors that become available. Remember, most processors are arm processors, and there are like 20 manufacturers of arm chips, and many of those

Re: Any usable SIMD implementation?

2016-04-07 Thread 9il via Digitalmars-d
On Thursday, 7 April 2016 at 03:27:31 UTC, Walter Bright wrote: I can understand that it might be demotivating for you, but that is not a blocker. A blocker has no reasonable workaround. This has a trivial workaround: gdc -simd=AFX foo.d becomes: gdc -simd=AFX -version=AFX foo.d It'

Re: Any usable SIMD implementation?

2016-04-07 Thread Walter Bright via Digitalmars-d
On 4/7/2016 12:59 AM, 9il wrote: 1. Executable size will grow with every instruction set release Yes, and nobody cares. With virtual memory and demand loading, unexecuted code will never be loaded off of disk and will never consume memory space. And with a 64 bit address space, there will nev

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
On Thu, 7 Apr 2016 12:25:03 +1000, Manu via Digitalmars-d wrote: > On 6 April 2016 at 23:26, 9il via Digitalmars-d > wrote: > > On Wednesday, 6 April 2016 at 12:40:04 UTC, Manu wrote: > >> > >> On 6 April 2016 at 07:41, Johan Engelen via Digitalmars-d > >> wrote: > >>> > >>> [...] > >> >

Re: Any usable SIMD implementation?

2016-04-07 Thread 9il via Digitalmars-d
On Thursday, 7 April 2016 at 09:41:06 UTC, Walter Bright wrote: On 4/7/2016 12:59 AM, 9il wrote: 1. Executable size will grow with every instruction set release Yes, and nobody cares. With virtual memory and demand loading, unexecuted code will never be loaded off of disk and will never cons

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
On Wed, 6 Apr 2016 17:42:30 -0700, Walter Bright wrote: > On 4/6/2016 5:36 AM, Manu via Digitalmars-d wrote: > > But at very least, the important detail is that the version ID's are > > standardised and shared among all compilers. > > It's a reasonable suggestion; some points: > > 1. This ha

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
On Wed, 6 Apr 2016 20:27:31 -0700, Walter Bright wrote: > On 4/6/2016 7:43 PM, Manu via Digitalmars-d wrote: > >> 1. This has been characterized as a blocker, it is not, as it does > >> not impede writing code that takes advantage of various SIMD code > >> generation at compile time. > > > > I

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
On Thu, 7 Apr 2016 02:41:06 -0700, Walter Bright wrote: > > 3. This would not solve the problem for generic BLAS implementation > > for Phobos at all! How you would force compiler to USE and NOT USE > > specific vector permutations for example in the same object file? > > Yes, I know, DMD has no

Re: Any usable SIMD implementation?

2016-04-07 Thread Kai Nacke via Digitalmars-d
On Thursday, 7 April 2016 at 03:27:31 UTC, Walter Bright wrote: Then, void app(int simd)() { ... my fabulous app ... } int main() { auto fpu = core.cpuid.getfpu(); switch (fpu) { case SIMD: app!(SIMD)(); break; case SIMD4: app!(SIMD4)(); break; defaul
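
The quoted snippet is cut off, so here is a self-contained sketch of the same idea: instantiate the application template once per SIMD level and pick the instance at run time. This is an illustration under assumptions, not the code from the original post. The enum `SimdLevel`, the helper `detectSimdLevel`, and the body of `app` are invented here; the sketch substitutes its own helper built on the documented core.cpuid properties in place of the `getfpu` call shown in the quote.

```d
import std.stdio : writeln;
static import core.cpuid;

/// Hypothetical level names for this sketch only.
enum SimdLevel { baseline, sse41, avx2 }

/// Hypothetical helper: maps core.cpuid flags onto SimdLevel.
SimdLevel detectSimdLevel()
{
    if (core.cpuid.avx2)  return SimdLevel.avx2;
    if (core.cpuid.sse41) return SimdLevel.sse41;
    return SimdLevel.baseline;
}

/// Stand-in for "my fabulous app": each instantiation is compiled separately,
/// so `static if` can select level-specific code. Here it just returns the
/// number of floats one vector of that level would hold.
int app(SimdLevel level)()
{
    static if (level == SimdLevel.avx2)
        return 8;
    else static if (level == SimdLevel.sse41)
        return 4;
    else
        return 1;
}

/// Run-time dispatch to the matching compile-time instantiation.
int dispatch(SimdLevel level)
{
    switch (level)
    {
        case SimdLevel.avx2:  return app!(SimdLevel.avx2)();
        case SimdLevel.sse41: return app!(SimdLevel.sse41)();
        default:              return app!(SimdLevel.baseline)();
    }
}

void main()
{
    writeln("floats per vector on this CPU: ", dispatch(detectSimdLevel()));
}
```

Every instantiation ends up in the binary, which is the executable-size trade-off argued over elsewhere in the thread.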

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
On Thu, 07 Apr 2016 10:52:42, Kai Nacke wrote: > On Thursday, 7 April 2016 at 03:27:31 UTC, Walter Bright wrote: > > Then, > > > > void app(int simd)() { ... my fabulous app ... } > > > > int main() { > > auto fpu = core.cpuid.getfpu(); > > switch (fpu) { > > ca

Re: Any usable SIMD implementation?

2016-04-07 Thread jmh530 via Digitalmars-d
On Thursday, 7 April 2016 at 10:03:50 UTC, 9il wrote: This is not true for BLAS based on D. Perhaps if you provide him a simplified example he might see what you're talking about?

Re: Any usable SIMD implementation?

2016-04-07 Thread 9il via Digitalmars-d
On Thursday, 7 April 2016 at 12:35:51 UTC, jmh530 wrote: On Thursday, 7 April 2016 at 10:03:50 UTC, 9il wrote: This is not true for BLAS based on D. Perhaps if you provide him a simplified example he might see what you're talking about? He know what I am talking about. This is about archi

Re: Any usable SIMD implementation?

2016-04-07 Thread Johan Engelen via Digitalmars-d
On Thursday, 7 April 2016 at 11:25:47 UTC, Johannes Pfau wrote: On Thu, 07 Apr 2016 10:52:42, Kai Nacke wrote: glibc has a special mechanism for resolving the called function during loading. See the section on the GNU Indirect Function Mechanism here: https://www.ibm.com/developerwork

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
On Thu, 07 Apr 2016 13:27:05, Johan Engelen wrote: > On Thursday, 7 April 2016 at 11:25:47 UTC, Johannes Pfau wrote: > > On Thu, 07 Apr 2016 10:52:42, > > Kai Nacke wrote: > > > >> glibc has a special mechanism for resolving the called > >> function during loading. See the secti

Re: Any usable SIMD implementation?

2016-04-07 Thread Johan Engelen via Digitalmars-d
On Thursday, 7 April 2016 at 14:46:06 UTC, Johannes Pfau wrote: On Thu, 07 Apr 2016 13:27:05, Johan Engelen wrote: On Thursday, 7 April 2016 at 11:25:47 UTC, Johannes Pfau wrote: > On Thu, 07 Apr 2016 10:52:42, > Kai Nacke wrote: > >> glibc has a special mechanism for resolving

Re: Any usable SIMD implementation?

2016-04-07 Thread Manu via Digitalmars-d
On 7 April 2016 at 13:27, Walter Bright via Digitalmars-d wrote: > On 4/6/2016 7:43 PM, Manu via Digitalmars-d wrote: >>> >>> 1. This has been characterized as a blocker, it is not, as it does not >>> impede writing code that takes advantage of various SIMD code generation >>> at >>> compile time.

Re: Any usable SIMD implementation?

2016-04-07 Thread Walter Bright via Digitalmars-d
On 4/7/2016 3:52 AM, Kai Nacke wrote: On Thursday, 7 April 2016 at 03:27:31 UTC, Walter Bright wrote: Then, void app(int simd)() { ... my fabulous app ... } int main() { auto fpu = core.cpuid.getfpu(); switch (fpu) { case SIMD: app!(SIMD)(); break; case SIMD

Re: Any usable SIMD implementation?

2016-04-07 Thread Walter Bright via Digitalmars-d
On 4/7/2016 5:27 PM, Manu via Digitalmars-d wrote: You'll have noticed that C++ interaction is my recent focus, since that's directly related to my current day-job, and the path that I need to solve now to get D into my work. We recognize C++ interoperability to be a key feature of D. I hope yo

Re: Any usable SIMD implementation?

2016-04-07 Thread Walter Bright via Digitalmars-d
On 4/7/2016 3:15 AM, Johannes Pfau wrote: The problem is that march=x can set more than one feature flag. So instead of gdc -march=armv7-a you have to do gdc -march=armv7-a -fversion=ARM_FEATURE_CRC32 -fversion=ARM_FEATURE_UNALIGNED ... So you have to know exactly which features are supported for

Re: Any usable SIMD implementation?

2016-04-11 Thread Marco Leise via Digitalmars-d
On Mon, 04 Apr 2016 18:35:26, 9il wrote: > @attribute("target", "+sse4")) would not work well for BLAS. BLAS > needs compile time constants. This is very important because BLAS > can be 95% portable, so I just need to write a code that would be > optimized very well by compiler. --Ilya

Re: Any usable SIMD implementation?

2016-04-11 Thread Marco Leise via Digitalmars-d
On Mon, 4 Apr 2016 11:43:58 -0700, Walter Bright wrote: > On 4/4/2016 9:21 AM, Marco Leise wrote: > >To put this to good use, we need a reliable way - basically > >a global variable - to check for SSE4 (or POPCNT, etc.). What > >we have now does not work across all compilers. > >

Re: Any usable SIMD implementation?

2016-04-11 Thread Marco Leise via Digitalmars-d
On Mon, 4 Apr 2016 13:29:11 -0700, Walter Bright wrote: > On 4/4/2016 7:02 AM, 9il wrote: > >> What kind of information? > > > > Target cpu configuration: > > - CPU architecture (done) > > Done. > > > - Count of FP/Integer registers > > ?? > > > - Allowed sets of instructions: for exam

Re: Any usable SIMD implementation?

2016-04-11 Thread Marco Leise via Digitalmars-d
On Wed, 6 Apr 2016 20:29:21 -0700, Walter Bright wrote: > On 4/6/2016 7:25 PM, Manu via Digitalmars-d wrote: > > TL;DR, defining architectures with an intel-centric naming convention > > is a very bad idea. > > You're not making a good case for a standard language defined set of > definition

Re: Any usable SIMD implementation?

2016-04-11 Thread Walter Bright via Digitalmars-d
On 4/11/2016 7:24 AM, Marco Leise wrote: On Mon, 4 Apr 2016 11:43:58 -0700, Walter Bright wrote: On 4/4/2016 9:21 AM, Marco Leise wrote: To put this to good use, we need a reliable way - basically a global variable - to check for SSE4 (or POPCNT, etc.). What we have now does not

Re: Any usable SIMD implementation?

2016-04-12 Thread xenon325 via Digitalmars-d
On Thursday, 7 April 2016 at 00:42:30 UTC, Walter Bright wrote: [...] especially if one is writing applications that dynamically adjusts based on the CPU the user is running on. The main trouble comes about when different modules are compiled with different settings. What happens with template

Re: Any usable SIMD implementation?

2016-04-12 Thread Marco Leise via Digitalmars-d
Am Mon, 11 Apr 2016 14:29:11 -0700 schrieb Walter Bright : > On 4/11/2016 7:24 AM, Marco Leise wrote: > > Am Mon, 4 Apr 2016 11:43:58 -0700 > > schrieb Walter Bright : > > > >> On 4/4/2016 9:21 AM, Marco Leise wrote: > >>> To put this to good use, we need a reliable way - basically > >>>

Re: Any usable SIMD implementation?

2016-04-12 Thread Marco Leise via Digitalmars-d
Am Tue, 12 Apr 2016 10:55:18 + schrieb xenon325 : > Have you seen GCC's function multiversioning [1]? > > This whole thread is far too low-level for me and I'm not sure if > GCC's dispatcher overhead is OK, but the syntax looks really nice > and it seems to address all of your concerns

Re: Any usable SIMD implementation?

2016-04-12 Thread Marco Leise via Digitalmars-d
The system seems to call CPUID at startup and, for every multiversioned function, patch an offset in its dispatcher function. The dispatcher function is then nothing more than a jump relative to RIP, e.g.: jmp QWORD PTR [rip+0x200bf2] This is as efficient as it gets short of using whole-progr
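
The dispatch scheme described in this post can be approximated by hand: choose an implementation once at startup, then route every call through a single pointer. What follows is only a minimal sketch, not GCC's actual multiversioning machinery. It assumes GCC or Clang on x86 for the `__builtin_cpu_init` / `__builtin_cpu_supports` built-ins and falls back to portable code elsewhere. The names `popcount_portable`, `popcount_hw`, `resolve_popcount`, `fast_popcount`, `check_dispatch` and the macro `HAVE_X86_CPU_BUILTINS` are hypothetical, chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Portable fallback: clear the lowest set bit until none remain.
static unsigned popcount_portable(std::uint64_t x) {
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;
        ++n;
    }
    return n;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_CPU_BUILTINS 1  // hypothetical macro, for this sketch only
// Only this one function is compiled with POPCNT enabled.
__attribute__((target("popcnt")))
static unsigned popcount_hw(std::uint64_t x) {
    return static_cast<unsigned>(__builtin_popcountll(x));
}
#endif

using popcount_fn = unsigned (*)(std::uint64_t);

// Pick an implementation once, based on what the running CPU reports.
static popcount_fn resolve_popcount() {
#ifdef HAVE_X86_CPU_BUILTINS
    __builtin_cpu_init();
    if (__builtin_cpu_supports("popcnt")) {
        return popcount_hw;
    }
#endif
    return popcount_portable;
}

// Resolved during static initialization; every later call is one
// indirect call through this pointer.
static const popcount_fn fast_popcount = resolve_popcount();

// Self-check: whichever version was selected must agree with the fallback.
inline void check_dispatch() {
    assert(fast_popcount(0) == 0u);
    assert(fast_popcount(0xFFu) == 8u);
    assert(fast_popcount(~std::uint64_t{0}) == popcount_portable(~std::uint64_t{0}));
}
```

Calling `check_dispatch()` early in `main` confirms that the selected version agrees with the fallback. The cost per call is one indirect call, which is roughly what the RIP-relative `jmp` in the post amounts to.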

Re: Any usable SIMD implementation?

2016-04-12 Thread Etienne via Digitalmars-d
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote: I'm currently working on a templated arrayop implementation (using RPN to encode ASTs). So far things worked out great, but now I got stuck b/c apparently none of the D compilers has a working SIMD implementation (maybe GDC has bu

Re: Any usable SIMD implementation?

2016-04-12 Thread Walter Bright via Digitalmars-d
On 4/12/2016 9:53 AM, Marco Leise wrote: LDC implements InlineAsm_X86_Any (DMD style asm), so core.cpuid works. GDC is the only compiler that does not implement it. We agree that core.cpuid should provide this information, but what we have now - core.cpuid in a mix with GDC's lack of DMD style as

Re: Any usable SIMD implementation?

2016-04-12 Thread Marco Leise via Digitalmars-d
Am Tue, 12 Apr 2016 13:22:12 -0700 schrieb Walter Bright : > On 4/12/2016 9:53 AM, Marco Leise wrote: > > LDC implements InlineAsm_X86_Any (DMD style asm), so > > core.cpuid works. GDC is the only compiler that does not > > implement it. We agree that core.cpuid should provide this > > information

Re: Any usable SIMD implementation?

2016-04-12 Thread Iain Buclaw via Digitalmars-d
On 12 April 2016 at 22:22, Walter Bright via Digitalmars-d wrote: > On 4/12/2016 9:53 AM, Marco Leise wrote: >> Your look on GCC (and LLVM) may be a bit biased. First of all >> you don't need to tell it exactly which registers to use. A >> rough classification is enough and gives the compiler a go

Re: Any usable SIMD implementation?

2016-04-12 Thread Walter Bright via Digitalmars-d
On 4/12/2016 4:35 PM, Iain Buclaw via Digitalmars-d wrote: It's a step backwards because I can't just say "MUL EAX". I have to tell GCC what register the result gets put in. This is, to my mind, ridiculous. GCC's inline assembler apparently has no knowledge of what the opcodes actually do. asm {

Re: Any usable SIMD implementation?

2016-04-12 Thread Walter Bright via Digitalmars-d
On 4/12/2016 4:29 PM, Marco Leise wrote: Am Tue, 12 Apr 2016 13:22:12 -0700 schrieb Walter Bright : On 4/12/2016 9:53 AM, Marco Leise wrote: LDC implements InlineAsm_X86_Any (DMD style asm), so core.cpuid works. GDC is the only compiler that does not implement it. We agree that core.cpuid shou

Re: Any usable SIMD implementation?

2016-04-13 Thread Iain Buclaw via Digitalmars-d
On 13 April 2016 at 07:59, Walter Bright via Digitalmars-d wrote: > On 4/12/2016 4:35 PM, Iain Buclaw via Digitalmars-d wrote: >> - What dialect am I writing in? (Do I emit mul or mull? eax or %eax?) >> - Some opcodes in IASM have a different name in the assembler (Emitted >> fdivrp as fdivp, and

Re: Any usable SIMD implementation?

2016-04-13 Thread Iain Buclaw via Digitalmars-d
On 13 April 2016 at 08:22, Walter Bright via Digitalmars-d wrote: > On 4/12/2016 4:29 PM, Marco Leise wrote: >> In practice GDC will just replace the invocation with a single >> 'mul' instruction while DMD will emit a call to this 18 >> instructions long function. Now you keep telling me extended

Re: Any usable SIMD implementation?

2016-04-13 Thread Marco Leise via Digitalmars-d
Am Wed, 13 Apr 2016 09:51:25 +0200 schrieb Iain Buclaw via Digitalmars-d : > On 13 April 2016 at 07:59, Walter Bright via Digitalmars-d > wrote: > > But core.cpuid needs to be made to work in GDC, whatever it takes to do so. > > > > Indeed, it's been on my TODO list for a long time, among many

Re: Any usable SIMD implementation?

2016-04-13 Thread Iain Buclaw via Digitalmars-d
On 13 April 2016 at 11:13, Marco Leise via Digitalmars-d wrote: > Am Wed, 13 Apr 2016 09:51:25 +0200 > schrieb Iain Buclaw via Digitalmars-d > : > >> On 13 April 2016 at 07:59, Walter Bright via Digitalmars-d >> wrote: >> > But core.cpuid needs to be made to work in GDC, whatever it takes to do s

Re: Any usable SIMD implementation?

2016-04-13 Thread Marco Leise via Digitalmars-d
Am Wed, 13 Apr 2016 11:21:35 +0200 schrieb Iain Buclaw via Digitalmars-d : > Yes, cpu_supports is a good way to do it as we only need to invoke > __builtin_cpu_init once and cache all values when running 'shared > static this()'. I was under the assumption that GCC already emits an 'early' static
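
The "initialize once, cache all values" idea quoted here can be sketched outside D as well. The C++ below is an illustration under assumptions, not GDC's or druntime's code. It uses the GCC/Clang x86 built-ins `__builtin_cpu_init` and `__builtin_cpu_supports`, with a function-local static standing in for D's `shared static this()`. `CpuFeatures`, `detect_cpu_features` and `cpu_features` are hypothetical names.

```cpp
// Feature flags cached after a single query of the CPU.
struct CpuFeatures {
    bool sse42;
    bool popcnt;
    bool avx2;
};

// Query the CPU once; on non-x86 targets every flag stays false.
static CpuFeatures detect_cpu_features() {
    CpuFeatures f{};
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    f.sse42  = __builtin_cpu_supports("sse4.2") != 0;
    f.popcnt = __builtin_cpu_supports("popcnt") != 0;
    f.avx2   = __builtin_cpu_supports("avx2") != 0;
#endif
    return f;
}

// The first call runs detection; later calls return the cached result.
// C++11 guarantees this initialization is thread-safe.
const CpuFeatures& cpu_features() {
    static const CpuFeatures cached = detect_cpu_features();
    return cached;
}
```

Code that needs a runtime check then reads `cpu_features().sse42` rather than issuing CPUID itself.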
