Re: Any usable SIMD implementation?

2016-08-23 Thread Ilya Yaroshenko via Digitalmars-d
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote: I'm currently working on a templated arrayop implementation (using RPN to encode ASTs). So far things worked out great, but now I got stuck b/c apparently none of the D compilers has a working SIMD implementation (maybe GDC has

Re: Any usable SIMD implementation?

2016-05-02 Thread Joe Duarte via Digitalmars-d
On Saturday, 23 April 2016 at 10:40:12 UTC, Johan Engelen wrote: On Monday, 18 April 2016 at 00:27:06 UTC, Joe Duarte wrote: Someone else talked about marking "Broadwell" and other generation names. As others have said, it's better to specify features. I wanted to chime in with a couple

Re: Any usable SIMD implementation?

2016-04-23 Thread Marco Leise via Digitalmars-d
Am Sat, 23 Apr 2016 10:40:12 + schrieb Johan Engelen : > I have a question perhaps you can comment on? > With LLVM, it is possible to specify something like "+sse3,-sse2" > (I did not test whether this actually results in SSE3 > instructions being used, but no SSE2 instructions).

Re: Any usable SIMD implementation?

2016-04-23 Thread Johan Engelen via Digitalmars-d
On Monday, 18 April 2016 at 00:27:06 UTC, Joe Duarte wrote: Someone else talked about marking "Broadwell" and other generation names. As others have said, it's better to specify features. I wanted to chime in with a couple of additional examples. Intel's transactional memory

Re: Any usable SIMD implementation?

2016-04-17 Thread Temtaime via Digitalmars-d
On Monday, 18 April 2016 at 00:27:06 UTC, Joe Duarte wrote: On Tuesday, 5 April 2016 at 10:27:46 UTC, Walter Bright wrote: Besides, I think it's a poor design to customize the app for only one SIMD type. A better idea (I've repeated this ad nauseam over the years) is to have n modules, one for

Re: Any usable SIMD implementation?

2016-04-17 Thread Joe Duarte via Digitalmars-d
On Tuesday, 5 April 2016 at 10:27:46 UTC, Walter Bright wrote: Besides, I think it's a poor design to customize the app for only one SIMD type. A better idea (I've repeated this ad nauseam over the years) is to have n modules, one for each supported SIMD type. Compile and link all of them in,

Re: Any usable SIMD implementation?

2016-04-17 Thread Marco Leise via Digitalmars-d
Am Sat, 16 Apr 2016 21:46:08 -0700 schrieb Walter Bright : > On 4/16/2016 2:40 PM, Marco Leise wrote: > > Tell me again, what's more elegant! > > If I wanted to write in assembler, I wouldn't write in a high level language, > especially a weird one like GNU

Re: Any usable SIMD implementation?

2016-04-16 Thread Walter Bright via Digitalmars-d
On 4/16/2016 2:40 PM, Marco Leise wrote: Tell me again, what's more elegant! If I wanted to write in assembler, I wouldn't write in a high level language, especially a weird one like GNU version.

Re: Any usable SIMD implementation?

2016-04-16 Thread Marco Leise via Digitalmars-d
Am Tue, 12 Apr 2016 23:22:37 -0700 schrieb Walter Bright : > >"mulq %[y]" > >: "=a" tmp.lo, "=d" tmp.hi : "a" x, [y] "rm" y; > > I don't see anything elegant about those lines, starting with "mulq" is not > in > any of the AMD or Intel CPU

Re: Any usable SIMD implementation?

2016-04-16 Thread Marco Leise via Digitalmars-d
Am Fri, 15 Apr 2016 18:54:12 + schrieb jmh530 : > On Tuesday, 12 April 2016 at 10:55:18 UTC, xenon325 wrote: > > > > Have you seen how GCC's function multiversioning [1] ? > > > > I've been thinking about the gcc multiversioning since you > mentioned it

Re: Any usable SIMD implementation?

2016-04-15 Thread jmh530 via Digitalmars-d
On Tuesday, 12 April 2016 at 10:55:18 UTC, xenon325 wrote: Have you seen how GCC's function multiversioning [1] ? I've been thinking about the gcc multiversioning since you mentioned it previously. I keep thinking about how the optimal algorithm for something like matrix multiplication

Re: Any usable SIMD implementation?

2016-04-15 Thread Johan Engelen via Digitalmars-d
On Sunday, 3 April 2016 at 07:39:00 UTC, Manu wrote: On 3 April 2016 at 16:14, 9il via Digitalmars-d wrote: Is it possible to introduce compile time information about target platform? I am working on BLAS from scratch implementation. And it is no hope to create

Re: Any usable SIMD implementation?

2016-04-14 Thread Walter Bright via Digitalmars-d
On 4/14/2016 1:21 AM, Iain Buclaw via Digitalmars-d wrote: An alternative interface needs to be invented anyway for other CPUs. That would be fine. But there is no reason to redo core.cpuid for x86 machines.

Re: Any usable SIMD implementation?

2016-04-14 Thread Iain Buclaw via Digitalmars-d
On 13 April 2016 at 13:14, Walter Bright via Digitalmars-d wrote: > On 4/13/2016 3:58 AM, Marco Leise wrote: >> >> How about this style as an alternative?: >> >> immutable bool mmx; >> immutable bool hasPopcnt; >> >> shared static this() >> { >> import

Re: Any usable SIMD implementation?

2016-04-13 Thread Walter Bright via Digitalmars-d
On 4/13/2016 5:47 AM, Marco Leise wrote: Yes, they are all @property and a substitution with direct access to the globals will work around GDC's lack of cross-module inlining. Otherwise these feature checks which might be used in hot code, are more costly than they should be. I hate when things

Re: Any usable SIMD implementation?

2016-04-13 Thread Marco Leise via Digitalmars-d
Am Wed, 13 Apr 2016 04:14:48 -0700 schrieb Walter Bright : > On 4/13/2016 3:58 AM, Marco Leise wrote: > > How about this style as an alternative?: > > > > immutable bool mmx; > > immutable bool hasPopcnt; > > > > shared static this() > > { > > import gcc.builtins;

Re: Any usable SIMD implementation?

2016-04-13 Thread Walter Bright via Digitalmars-d
On 4/13/2016 3:58 AM, Marco Leise wrote: How about this style as an alternative?: immutable bool mmx; immutable bool hasPopcnt; shared static this() { import gcc.builtins; mmx = __builtin_cpu_supports("mmx" ) > 0; hasPopcnt = __builtin_cpu_supports("popcnt") > 0; }
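
The caching style quoted above can also be sketched without `gcc.builtins`, assuming the `mmx` and `hasPopcnt` properties of `core.cpuid` are usable on the target compiler (which, per this thread, was not the case for GDC at the time). The names `cpuHasMmx` and `cpuHasPopcnt` are made up for this illustration:

```d
import core.cpuid : mmx, hasPopcnt;
import std.stdio : writeln;

// Hypothetical names: feature flags cached once at program start,
// so hot code reads a plain global instead of calling a property.
immutable bool cpuHasMmx;
immutable bool cpuHasPopcnt;

shared static this()
{
    // immutable module-level variables may be initialized in a
    // shared module constructor.
    cpuHasMmx    = mmx;
    cpuHasPopcnt = hasPopcnt;
}

void main()
{
    writeln("MMX: ", cpuHasMmx, ", POPCNT: ", cpuHasPopcnt);
}
```

Reading the globals directly sidesteps the cross-module inlining concern raised elsewhere in the thread, at the cost of duplicating state that `core.cpuid` already holds.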

Re: Any usable SIMD implementation?

2016-04-13 Thread Marco Leise via Digitalmars-d
Am Wed, 13 Apr 2016 11:21:35 +0200 schrieb Iain Buclaw via Digitalmars-d : > Yes, cpu_supports is a good way to do it as we only need to invoke > __builtin_cpu_init once and cache all values when running 'shared > static this()'. I was under the assumption that GCC

Re: Any usable SIMD implementation?

2016-04-13 Thread Iain Buclaw via Digitalmars-d
On 13 April 2016 at 11:13, Marco Leise via Digitalmars-d wrote: > Am Wed, 13 Apr 2016 09:51:25 +0200 > schrieb Iain Buclaw via Digitalmars-d > : > >> On 13 April 2016 at 07:59, Walter Bright via Digitalmars-d >>

Re: Any usable SIMD implementation?

2016-04-13 Thread Marco Leise via Digitalmars-d
Am Wed, 13 Apr 2016 09:51:25 +0200 schrieb Iain Buclaw via Digitalmars-d : > On 13 April 2016 at 07:59, Walter Bright via Digitalmars-d > wrote: > > But core.cpuid needs to be made to work in GDC, whatever it takes to do so. > > > >

Re: Any usable SIMD implementation?

2016-04-13 Thread Iain Buclaw via Digitalmars-d
On 13 April 2016 at 08:22, Walter Bright via Digitalmars-d wrote: > On 4/12/2016 4:29 PM, Marco Leise wrote: >> In practice GDC will just replace the invocation with a single >> 'mul' instruction while DMD will emit a call to this 18 >> instructions long function. Now

Re: Any usable SIMD implementation?

2016-04-13 Thread Iain Buclaw via Digitalmars-d
On 13 April 2016 at 07:59, Walter Bright via Digitalmars-d wrote: > On 4/12/2016 4:35 PM, Iain Buclaw via Digitalmars-d wrote: >> - What dialect am I writing in? (Do I emit mul or mull? eax or %eax?) >> - Some opcodes in IASM have a different name in the assembler

Re: Any usable SIMD implementation?

2016-04-13 Thread Walter Bright via Digitalmars-d
On 4/12/2016 4:29 PM, Marco Leise wrote: Am Tue, 12 Apr 2016 13:22:12 -0700 schrieb Walter Bright : On 4/12/2016 9:53 AM, Marco Leise wrote: LDC implements InlineAsm_X86_Any (DMD style asm), so core.cpuid works. GDC is the only compiler that does not implement it.

Re: Any usable SIMD implementation?

2016-04-13 Thread Walter Bright via Digitalmars-d
On 4/12/2016 4:35 PM, Iain Buclaw via Digitalmars-d wrote: It's a step backwards because I can't just say "MUL EAX". I have to tell GCC what register the result gets put in. This is, to my mind, ridiculous. GCC's inline assembler apparently has no knowledge of what the opcodes actually do. asm

Re: Any usable SIMD implementation?

2016-04-12 Thread Iain Buclaw via Digitalmars-d
On 12 April 2016 at 22:22, Walter Bright via Digitalmars-d wrote: > On 4/12/2016 9:53 AM, Marco Leise wrote: >> Your look on GCC (and LLVM) may be a bit biased. First of all >> you don't need to tell it exactly which registers to use. A >> rough classification is

Re: Any usable SIMD implementation?

2016-04-12 Thread Marco Leise via Digitalmars-d
Am Tue, 12 Apr 2016 13:22:12 -0700 schrieb Walter Bright : > On 4/12/2016 9:53 AM, Marco Leise wrote: > > LDC implements InlineAsm_X86_Any (DMD style asm), so > > core.cpuid works. GDC is the only compiler that does not > > implement it. We agree that core.cpuid should

Re: Any usable SIMD implementation?

2016-04-12 Thread Walter Bright via Digitalmars-d
On 4/12/2016 9:53 AM, Marco Leise wrote: LDC implements InlineAsm_X86_Any (DMD style asm), so core.cpuid works. GDC is the only compiler that does not implement it. We agree that core.cpuid should provide this information, but what we have now - core.cpuid in a mix with GDC's lack of DMD style

Re: Any usable SIMD implementation?

2016-04-12 Thread Etienne via Digitalmars-d
On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote: I'm currently working on a templated arrayop implementation (using RPN to encode ASTs). So far things worked out great, but now I got stuck b/c apparently none of the D compilers has a working SIMD implementation (maybe GDC has

Re: Any usable SIMD implementation?

2016-04-12 Thread Marco Leise via Digitalmars-d
The system seems to call CPUID at startup and for every multiversioned function, patch an offset in its dispatcher function. The dispatcher function is then nothing more than a jump relative to RIP, e.g.: jmp QWORD PTR [rip+0x200bf2] This is as efficient as it gets short of using
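
A rough D analogue of the startup-time dispatch described above, as a sketch only: it does not reproduce GCC's multiversioning, it just selects a function pointer once in a module constructor. `sumGeneric`, `sumAvx2` and `sum` are hypothetical names, and the "AVX2" variant is a stand-in that reuses the generic loop:

```d
static import core.cpuid;
import std.stdio : writeln;

// Portable fallback.
int sumGeneric(const int[] a)
{
    int s = 0;
    foreach (x; a)
        s += x;
    return s;
}

// Stand-in for a hand-tuned AVX2 version; it reuses the generic
// loop so the sketch runs on any CPU.
int sumAvx2(const int[] a)
{
    return sumGeneric(a);
}

// Hypothetical dispatcher slot: one process-wide function pointer,
// written once at startup and only read afterwards.
__gshared int function(const int[]) sum;

shared static this()
{
    sum = core.cpuid.avx2 ? &sumAvx2 : &sumGeneric;
}

void main()
{
    writeln(sum([1, 2, 3]));
}
```

Each call then costs one indirect call through `sum`, which is the same order of overhead as the RIP-relative jump mentioned in the post.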

Re: Any usable SIMD implementation?

2016-04-12 Thread Marco Leise via Digitalmars-d
Am Tue, 12 Apr 2016 10:55:18 + schrieb xenon325 : > Have you seen how GCC's function multiversioning [1] ? > > This whole thread is far too low-level for me and I'm not sure if > GCC's dispatcher overhead is OK, but the syntax looks really nice > and it seems to

Re: Any usable SIMD implementation?

2016-04-12 Thread Marco Leise via Digitalmars-d
Am Mon, 11 Apr 2016 14:29:11 -0700 schrieb Walter Bright : > On 4/11/2016 7:24 AM, Marco Leise wrote: > > Am Mon, 4 Apr 2016 11:43:58 -0700 > > schrieb Walter Bright : > > > >> On 4/4/2016 9:21 AM, Marco Leise wrote: > >>> To put

Re: Any usable SIMD implementation?

2016-04-12 Thread xenon325 via Digitalmars-d
On Thursday, 7 April 2016 at 00:42:30 UTC, Walter Bright wrote: [...] especially if one is writing applications that dynamically adjusts based on the CPU the user is running on. The main trouble comes about when different modules are compiled with different settings. What happens with template

Re: Any usable SIMD implementation?

2016-04-11 Thread Walter Bright via Digitalmars-d
On 4/11/2016 7:24 AM, Marco Leise wrote: Am Mon, 4 Apr 2016 11:43:58 -0700 schrieb Walter Bright : On 4/4/2016 9:21 AM, Marco Leise wrote: To put this to good use, we need a reliable way - basically a global variable - to check for SSE4 (or POPCNT, etc.).

Re: Any usable SIMD implementation?

2016-04-11 Thread Marco Leise via Digitalmars-d
Am Wed, 6 Apr 2016 20:29:21 -0700 schrieb Walter Bright : > On 4/6/2016 7:25 PM, Manu via Digitalmars-d wrote: > > TL;DR, defining architectures with an intel-centric naming convention > > is a very bad idea. > > You're not making a good case for a standard language

Re: Any usable SIMD implementation?

2016-04-11 Thread Marco Leise via Digitalmars-d
Am Mon, 4 Apr 2016 13:29:11 -0700 schrieb Walter Bright : > On 4/4/2016 7:02 AM, 9il wrote: > >> What kind of information? > > > > Target cpu configuration: > > - CPU architecture (done) > > Done. > > > - Count of FP/Integer registers > > ?? > > > - Allowed

Re: Any usable SIMD implementation?

2016-04-11 Thread Marco Leise via Digitalmars-d
Am Mon, 4 Apr 2016 11:43:58 -0700 schrieb Walter Bright : > On 4/4/2016 9:21 AM, Marco Leise wrote: > >To put this to good use, we need a reliable way - basically > >a global variable - to check for SSE4 (or POPCNT, etc.). What > >we have now does not work

Re: Any usable SIMD implementation?

2016-04-11 Thread Marco Leise via Digitalmars-d
Am Mon, 04 Apr 2016 18:35:26 + schrieb 9il : > @attribute("target", "+sse4")) would not work well for BLAS. BLAS > needs compile time constants. This is very important because BLAS > can be 95% portable, so I just need to write a code that would be > optimized

Re: Any usable SIMD implementation?

2016-04-07 Thread Walter Bright via Digitalmars-d
On 4/7/2016 3:15 AM, Johannes Pfau wrote: The problem is that march=x can set more than one feature flag. So instead of gdc -march=armv7-a you have to do gdc -march=armv7-a -fversion=ARM_FEATURE_CRC32 -fversion=ARM_FEATURE_UNALIGNED ... You have to know exactly which features are supported for

Re: Any usable SIMD implementation?

2016-04-07 Thread Walter Bright via Digitalmars-d
On 4/7/2016 5:27 PM, Manu via Digitalmars-d wrote: You'll have noticed that C++ interaction is my recent focus, since that's directly related to my current day-job, and the path that I need to solve now to get D into my work. We recognize C++ interoperability to be a key feature of D. I hope

Re: Any usable SIMD implementation?

2016-04-07 Thread Walter Bright via Digitalmars-d
On 4/7/2016 3:52 AM, Kai Nacke wrote: On Thursday, 7 April 2016 at 03:27:31 UTC, Walter Bright wrote: Then, void app(int simd)() { ... my fabulous app ... } int main() { auto fpu = core.cpuid.getfpu(); switch (fpu) { case SIMD: app!(SIMD)(); break; case

Re: Any usable SIMD implementation?

2016-04-07 Thread Manu via Digitalmars-d
On 7 April 2016 at 13:27, Walter Bright via Digitalmars-d wrote: > On 4/6/2016 7:43 PM, Manu via Digitalmars-d wrote: >>> >>> 1. This has been characterized as a blocker, it is not, as it does not >>> impede writing code that takes advantage of various SIMD code

Re: Any usable SIMD implementation?

2016-04-07 Thread Johan Engelen via Digitalmars-d
On Thursday, 7 April 2016 at 14:46:06 UTC, Johannes Pfau wrote: Am Thu, 07 Apr 2016 13:27:05 + schrieb Johan Engelen : On Thursday, 7 April 2016 at 11:25:47 UTC, Johannes Pfau wrote: > Am Thu, 07 Apr 2016 10:52:42 + > schrieb Kai Nacke : > >> glibc has a

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
Am Thu, 07 Apr 2016 13:27:05 + schrieb Johan Engelen : > On Thursday, 7 April 2016 at 11:25:47 UTC, Johannes Pfau wrote: > > Am Thu, 07 Apr 2016 10:52:42 + > > schrieb Kai Nacke : > > > >> glibc has a special mechanism for resolving the called > >> function

Re: Any usable SIMD implementation?

2016-04-07 Thread Johan Engelen via Digitalmars-d
On Thursday, 7 April 2016 at 11:25:47 UTC, Johannes Pfau wrote: Am Thu, 07 Apr 2016 10:52:42 + schrieb Kai Nacke : glibc has a special mechanism for resolving the called function during loading. See the section on the GNU Indirect Function Mechanism here:

Re: Any usable SIMD implementation?

2016-04-07 Thread 9il via Digitalmars-d
On Thursday, 7 April 2016 at 12:35:51 UTC, jmh530 wrote: On Thursday, 7 April 2016 at 10:03:50 UTC, 9il wrote: This is not true for BLAS based on D. Perhaps if you provide him a simplified example he might see what you're talking about? He know what I am talking about. This is about

Re: Any usable SIMD implementation?

2016-04-07 Thread jmh530 via Digitalmars-d
On Thursday, 7 April 2016 at 10:03:50 UTC, 9il wrote: This is not true for BLAS based on D. Perhaps if you provide him a simplified example he might see what you're talking about?

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
Am Thu, 07 Apr 2016 10:52:42 + schrieb Kai Nacke : > On Thursday, 7 April 2016 at 03:27:31 UTC, Walter Bright wrote: > > Then, > > > > void app(int simd)() { ... my fabulous app ... } > > > > int main() { > > auto fpu = core.cpuid.getfpu(); > > switch

Re: Any usable SIMD implementation?

2016-04-07 Thread Kai Nacke via Digitalmars-d
On Thursday, 7 April 2016 at 03:27:31 UTC, Walter Bright wrote: Then, void app(int simd)() { ... my fabulous app ... } int main() { auto fpu = core.cpuid.getfpu(); switch (fpu) { case SIMD: app!(SIMD)(); break; case SIMD4: app!(SIMD4)(); break;
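
The quoted dispatch idea (one template instantiation per SIMD level, chosen at run time) might look roughly like this. `core.cpuid.getfpu` from the quote is not used, since it appears to be shorthand in the post; the sketch assumes the existing `core.cpuid.avx2` and `core.cpuid.sse42` properties instead. `SimdLevel`, `detectLevel`, `app` and `run` are hypothetical names:

```d
static import core.cpuid;
import std.stdio : writeln;

// Hypothetical feature levels, for illustration only.
enum SimdLevel { baseline, sse42, avx2 }

// Pick the best level the running CPU reports.
SimdLevel detectLevel()
{
    if (core.cpuid.avx2)
        return SimdLevel.avx2;
    if (core.cpuid.sse42)
        return SimdLevel.sse42;
    return SimdLevel.baseline;
}

// One instantiation per level; `static if` selects the code path
// at compile time inside each instantiation.
string app(SimdLevel level)()
{
    static if (level == SimdLevel.avx2)
        return "AVX2 path";
    else static if (level == SimdLevel.sse42)
        return "SSE4.2 path";
    else
        return "baseline path";
}

// Run-time switch that forwards to the matching instantiation.
string run(SimdLevel level)
{
    final switch (level)
    {
        case SimdLevel.baseline: return app!(SimdLevel.baseline)();
        case SimdLevel.sse42:    return app!(SimdLevel.sse42)();
        case SimdLevel.avx2:     return app!(SimdLevel.avx2)();
    }
}

void main()
{
    writeln(run(detectLevel()));
}
```

All three instantiations end up in the binary, which is the executable-size trade-off debated further down the thread.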

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
Am Thu, 7 Apr 2016 02:41:06 -0700 schrieb Walter Bright : > > 3. This would not solve the problem for generic BLAS implementation > > for Phobos at all! How you would force compiler to USE and NOT USE > > specific vector permutations for example in the same object

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
Am Wed, 6 Apr 2016 20:27:31 -0700 schrieb Walter Bright : > On 4/6/2016 7:43 PM, Manu via Digitalmars-d wrote: > >> 1. This has been characterized as a blocker, it is not, as it does > >> not impede writing code that takes advantage of various SIMD code > >> generation

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
Am Wed, 6 Apr 2016 17:42:30 -0700 schrieb Walter Bright : > On 4/6/2016 5:36 AM, Manu via Digitalmars-d wrote: > > But at very least, the important detail is that the version ID's are > > standardised and shared among all compilers. > > It's a reasonable suggestion;

Re: Any usable SIMD implementation?

2016-04-07 Thread 9il via Digitalmars-d
On Thursday, 7 April 2016 at 09:41:06 UTC, Walter Bright wrote: On 4/7/2016 12:59 AM, 9il wrote: 1. Executable size will grow with every instruction set release Yes, and nobody cares. With virtual memory and demand loading, unexecuted code will never be loaded off of disk and will never

Re: Any usable SIMD implementation?

2016-04-07 Thread Johannes Pfau via Digitalmars-d
Am Thu, 7 Apr 2016 12:25:03 +1000 schrieb Manu via Digitalmars-d : > On 6 April 2016 at 23:26, 9il via Digitalmars-d > wrote: > > On Wednesday, 6 April 2016 at 12:40:04 UTC, Manu wrote: > >> > >> On 6 April 2016 at 07:41, Johan Engelen

Re: Any usable SIMD implementation?

2016-04-07 Thread Walter Bright via Digitalmars-d
On 4/7/2016 12:59 AM, 9il wrote: 1. Executable size will grow with every instruction set release Yes, and nobody cares. With virtual memory and demand loading, unexecuted code will never be loaded off of disk and will never consume memory space. And with a 64 bit address space, there will

Re: Any usable SIMD implementation?

2016-04-07 Thread 9il via Digitalmars-d
On Thursday, 7 April 2016 at 03:27:31 UTC, Walter Bright wrote: I can understand that it might be demotivating for you, but that is not a blocker. A blocker has no reasonable workaround. This has a trivial workaround: gdc -simd=AFX foo.d becomes: gdc -simd=AFX -version=AFX foo.d
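
On the source side, the workaround quoted above amounts to testing a user-defined version identifier. A minimal sketch, keeping the identifier `AFX` exactly as written in the post; `simdName` is a made-up name, and the identifier is assumed to be set with the compiler's usual version switch (`-version=AFX` for DMD, `-fversion=AFX` for GDC, `-d-version=AFX` for LDC):

```d
import std.stdio : writeln;

// `AFX` is only defined when passed on the command line;
// without it the `else` branch is compiled instead.
version (AFX)
    enum simdName = "AFX";
else
    enum simdName = "generic";

void main()
{
    writeln("compiled for: ", simdName);
}
```

The objection raised in the thread is that nothing ties this identifier to the code-generation flag, so the two can silently disagree.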

Re: Any usable SIMD implementation?

2016-04-06 Thread Walter Bright via Digitalmars-d
On 4/6/2016 7:25 PM, Manu via Digitalmars-d wrote: Sure, but it's an ongoing maintenance task, constantly requiring population with metadata for new processors that become available. Remember, most processors are arm processors, and there are like 20 manufacturers of arm chips, and many of those

Re: Any usable SIMD implementation?

2016-04-06 Thread Walter Bright via Digitalmars-d
On 4/6/2016 7:43 PM, Manu via Digitalmars-d wrote: 1. This has been characterized as a blocker, it is not, as it does not impede writing code that takes advantage of various SIMD code generation at compile time. It's sufficiently blocking that I have not felt like working any further without

Re: Any usable SIMD implementation?

2016-04-06 Thread Manu via Digitalmars-d
On 7 April 2016 at 10:42, Walter Bright via Digitalmars-d wrote: > On 4/6/2016 5:36 AM, Manu via Digitalmars-d wrote: >> >> But at very least, the important detail is that the version ID's are >> standardised and shared among all compilers. > > > It's a reasonable

Re: Any usable SIMD implementation?

2016-04-06 Thread Manu via Digitalmars-d
On 6 April 2016 at 23:26, 9il via Digitalmars-d wrote: > On Wednesday, 6 April 2016 at 12:40:04 UTC, Manu wrote: >> >> On 6 April 2016 at 07:41, Johan Engelen via Digitalmars-d >> wrote: >>> >>> [...] >> >> >> With respect to SIMD,

Re: Any usable SIMD implementation?

2016-04-06 Thread Walter Bright via Digitalmars-d
On 4/6/2016 5:36 AM, Manu via Digitalmars-d wrote: But at very least, the important detail is that the version ID's are standardised and shared among all compilers. It's a reasonable suggestion; some points: 1. This has been characterized as a blocker, it is not, as it does not impede

Re: Any usable SIMD implementation?

2016-04-06 Thread 9il via Digitalmars-d
On Wednesday, 6 April 2016 at 14:31:58 UTC, Johan Engelen wrote: Probably the most difficult part is defining an API. Ilya made a start here: http://forum.dlang.org/post/eodutgruoofruperr...@forum.dlang.org (but he doesn't like his earlier API "bool a = __target("broadwell")" any more ;-P , I

Re: Any usable SIMD implementation?

2016-04-06 Thread Johan Engelen via Digitalmars-d
On Wednesday, 6 April 2016 at 13:26:51 UTC, 9il wrote: On Wednesday, 6 April 2016 at 12:40:04 UTC, Manu wrote: On 6 April 2016 at 07:41, Johan Engelen via Digitalmars-d wrote: [...] With respect to SIMD, knowing a processor model like 'broadwell' is not

Re: Any usable SIMD implementation?

2016-04-06 Thread 9il via Digitalmars-d
On Wednesday, 6 April 2016 at 12:40:04 UTC, Manu wrote: On 6 April 2016 at 07:41, Johan Engelen via Digitalmars-d wrote: [...] With respect to SIMD, knowing a processor model like 'broadwell' is not helpful, since we really want to know 'sse4'. If we know

Re: Any usable SIMD implementation?

2016-04-06 Thread Manu via Digitalmars-d
On 6 April 2016 at 07:41, Johan Engelen via Digitalmars-d wrote: > On Tuesday, 5 April 2016 at 21:29:41 UTC, Walter Bright wrote: >> >> >> I want to make it clear that dmd does not generate AFX specific code, has >> no switch to enable AFX code generation and has no

Re: Any usable SIMD implementation?

2016-04-06 Thread Manu via Digitalmars-d
On 5 April 2016 at 20:30, Walter Bright via Digitalmars-d wrote: > On 4/5/2016 2:39 AM, 9il wrote: >> >> On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: >>> >>> On 4/4/2016 11:10 PM, 9il wrote: >>> I still don't understand why you cannot just set

Re: Any usable SIMD implementation?

2016-04-06 Thread jmh530 via Digitalmars-d
On Wednesday, 6 April 2016 at 06:11:15 UTC, 9il wrote: Yes, only few of us would use this feature directly, however, many of us would use this under-the-hood in BLAS/SIMD oriented part of Phobos. Especially since everyone says to use LDC for the fastest code anyway...

Re: Any usable SIMD implementation?

2016-04-06 Thread 9il via Digitalmars-d
On Tuesday, 5 April 2016 at 21:41:46 UTC, Johan Engelen wrote: On Tuesday, 5 April 2016 at 21:29:41 UTC, Walter Bright wrote: I want to make it clear that dmd does not generate AFX specific code, has no switch to enable AFX code generation and has no basis for setting predefined version

Re: Any usable SIMD implementation?

2016-04-06 Thread 9il via Digitalmars-d
On Tuesday, 5 April 2016 at 21:29:41 UTC, Walter Bright wrote: On 4/5/2016 4:07 AM, 9il wrote: On Tuesday, 5 April 2016 at 10:30:19 UTC, Walter Bright wrote: On 4/5/2016 2:39 AM, 9il wrote: On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: 1. This would help to eliminate

Re: Any usable SIMD implementation?

2016-04-06 Thread 9il via Digitalmars-d
On Wednesday, 6 April 2016 at 00:45:54 UTC, Walter Bright wrote: On 4/5/2016 4:17 AM, 9il wrote: What wrong for scientist to write `-mcpu=native`? Because it would affect all the code in the module and every template it imports, which is a problem if you are using 'static if' and want to

Re: Any usable SIMD implementation?

2016-04-05 Thread Walter Bright via Digitalmars-d
On 4/5/2016 4:17 AM, 9il wrote: What wrong for scientist to write `-mcpu=native`? Because it would affect all the code in the module and every template it imports, which is a problem if you are using 'static if' and want to compile different pieces with different settings.

Re: Any usable SIMD implementation?

2016-04-05 Thread Johan Engelen via Digitalmars-d
On Tuesday, 5 April 2016 at 21:29:41 UTC, Walter Bright wrote: I want to make it clear that dmd does not generate AFX specific code, has no switch to enable AFX code generation and has no basis for setting predefined version identifiers for it. How about adding a "__target(...)"

Re: Any usable SIMD implementation?

2016-04-05 Thread Walter Bright via Digitalmars-d
On 4/5/2016 4:07 AM, 9il wrote: On Tuesday, 5 April 2016 at 10:30:19 UTC, Walter Bright wrote: On 4/5/2016 2:39 AM, 9il wrote: On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: 1. This would help to eliminate configuration bugs. 2. This would reduce work for users and simplified

Re: Any usable SIMD implementation?

2016-04-05 Thread 9il via Digitalmars-d
On Tuesday, 5 April 2016 at 10:27:46 UTC, Walter Bright wrote: On 4/5/2016 2:03 AM, John Colvin wrote: There's a line between trying to standardize everything and letting add-on libraries be free to innovate. Besides, I think it's a poor design to customize the app for only one SIMD type. A

Re: Any usable SIMD implementation?

2016-04-05 Thread 9il via Digitalmars-d
On Tuesday, 5 April 2016 at 10:30:19 UTC, Walter Bright wrote: On 4/5/2016 2:39 AM, 9il wrote: On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: 1. This would help to eliminate configuration bugs. 2. This would reduce work for users and simplified user experience. 3. This is

Re: Any usable SIMD implementation?

2016-04-05 Thread Johan Engelen via Digitalmars-d
On Tuesday, 5 April 2016 at 09:39:21 UTC, 9il wrote: 3. This is possible and not very hard to implement if I am not wrong. Last time I looked into this (related to implementing @target, see [1]), I only found some Clang code dealing with this, but now I found LLVM functions about

Re: Any usable SIMD implementation?

2016-04-05 Thread Walter Bright via Digitalmars-d
On 4/5/2016 2:39 AM, 9il wrote: On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: On 4/4/2016 11:10 PM, 9il wrote: I still don't understand why you cannot just set '-version=xxx' on the command line and then switch off that version in your custom code. I can do it, however I

Re: Any usable SIMD implementation?

2016-04-05 Thread Walter Bright via Digitalmars-d
On 4/5/2016 2:03 AM, John Colvin wrote: So you're suggesting that libraries invent their own list of versions for specific architectures / CPU features, which the user then has to specify somehow on the command line? I want to be able to write code that uses standardised versions that work

Re: Any usable SIMD implementation?

2016-04-05 Thread 9il via Digitalmars-d
On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: On 4/4/2016 11:10 PM, 9il wrote: I still don't understand why you cannot just set '-version=xxx' on the command line and then switch off that version in your custom code. I can do it, however I would like to get this information

Re: Any usable SIMD implementation?

2016-04-05 Thread John Colvin via Digitalmars-d
On Tuesday, 5 April 2016 at 08:34:32 UTC, Walter Bright wrote: On 4/4/2016 11:10 PM, 9il wrote: It is impossible to deduce from that combination that Xeon Phi has 32 FP registers. Since dmd doesn't generate specific code for a Xeon Phi, having a compile time switch for it is meaningless.

Re: Any usable SIMD implementation?

2016-04-05 Thread Walter Bright via Digitalmars-d
On 4/4/2016 11:10 PM, 9il wrote: It is impossible to deduce from that combination that Xeon Phi has 32 FP registers. Since dmd doesn't generate specific code for a Xeon Phi, having a compile time switch for it is meaningless. "Since the compiler never generates AVX or AVX2" - this is

Re: Any usable SIMD implementation?

2016-04-05 Thread 9il via Digitalmars-d
On Monday, 4 April 2016 at 21:13:30 UTC, jmh530 wrote: On Monday, 4 April 2016 at 21:05:44 UTC, 9il wrote: OpenBLAS kernels is 30 MB of assembler code! So we would be able to replace it once and for a very long time with Phobos. Are you familiar with this project at all?

Re: Any usable SIMD implementation?

2016-04-05 Thread 9il via Digitalmars-d
On Monday, 4 April 2016 at 22:34:06 UTC, Walter Bright wrote: On 4/4/2016 2:05 PM, 9il wrote: - Count of FP/Integer registers ?? How many general purpose registers, SIMD Floating Point registers, SIMD Integer registers have a CPU? These are deducible from X86, X86_64, and SIMD version

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 2:05 PM, 9il wrote: - Count of FP/Integer registers ?? How many general purpose registers, SIMD Floating Point registers, SIMD Integer registers have a CPU? These are deducible from X86, X86_64, and SIMD version identifiers. Needs to know is it AVX or AVX2 in compile time

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 2:11 PM, jmh530 wrote: version(D_SIMD) will tell you when SIMD is implemented, but not what type of SIMD. The first SIMD level. For instance, if I am on a machine that can use AVX2 instructions, then code in a version(D_SIMD) block will execute, but it should also execute if the

Re: Any usable SIMD implementation?

2016-04-04 Thread jmh530 via Digitalmars-d
On Monday, 4 April 2016 at 21:05:44 UTC, 9il wrote: OpenBLAS kernels is 30 MB of assembler code! So we would be able to replace it once and for a very long time with Phobos. Are you familiar with this project at all? https://github.com/flame/blis

Re: Any usable SIMD implementation?

2016-04-04 Thread jmh530 via Digitalmars-d
On Monday, 4 April 2016 at 20:29:11 UTC, Walter Bright wrote: - Allowed sets of instructions: for example, AVX2, FMA4 Done. D_SIMD I'm not a SIMD expert, I've only played around with SIMD a little, but this confuses me. version(D_SIMD) will tell you when SIMD is implemented, but not

Re: Any usable SIMD implementation?

2016-04-04 Thread 9il via Digitalmars-d
On Monday, 4 April 2016 at 20:29:11 UTC, Walter Bright wrote: On 4/4/2016 7:02 AM, 9il wrote: What kind of information? Target cpu configuration: - CPU architecture (done) Done. - Count of FP/Integer registers ?? How many general purpose registers, SIMD Floating Point registers, SIMD

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 7:02 AM, 9il wrote: What kind of information? Target cpu configuration: - CPU architecture (done) Done. - Count of FP/Integer registers ?? - Allowed sets of instructions: for example, AVX2, FMA4 Done. D_SIMD - Compiler optimization options (for math) Moot. DMD does

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 12:55 PM, ZombineDev wrote: I believe the issue is fixed (for DMD) with a documentation improvement. I believe the problem is that you can't rely on D_SIMD that SSE4, FMA, AVX2, AVX-512, etc. are available on the target platform. See also

Re: Any usable SIMD implementation?

2016-04-04 Thread ZombineDev via Digitalmars-d
On Monday, 4 April 2016 at 19:43:43 UTC, Walter Bright wrote: On 4/4/2016 10:23 AM, Jack Stouffer wrote: On Sunday, 3 April 2016 at 07:39:00 UTC, Manu wrote: My SIMD implementation has been blocked on that for years too. I need to know the SIMD level flags passed to the compiler at least,

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 10:23 AM, Jack Stouffer wrote: On Sunday, 3 April 2016 at 07:39:00 UTC, Manu wrote: My SIMD implementation has been blocked on that for years too. I need to know the SIMD level flags passed to the compiler at least, and DMD needs to introduce the concept. I made a bug to track

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 9:21 AM, Marco Leise wrote: To put this to good use, we need a reliable way - basically a global variable - to check for SSE4 (or POPCNT, etc.). What we have now does not work across all compilers. http://dlang.org/phobos/core_cpuid.html

Re: Any usable SIMD implementation?

2016-04-04 Thread Walter Bright via Digitalmars-d
On 4/4/2016 10:27 AM, jmh530 wrote: On Monday, 4 April 2016 at 17:23:49 UTC, Jack Stouffer wrote: I made a bug to track this problem: https://issues.dlang.org/show_bug.cgi?id=15873 You might add link to this thread and github where he made the original comment..

Re: Any usable SIMD implementation?

2016-04-04 Thread 9il via Digitalmars-d
On Monday, 4 April 2016 at 16:21:15 UTC, Marco Leise wrote: Am Mon, 04 Apr 2016 14:02:03 + schrieb 9il : - On amd64, whether floating-point math is handled by the FPU or SSE. When emulating floating-point, e.g. for float-to-string and string-to-float code, it is

Re: Any usable SIMD implementation?

2016-04-04 Thread jmh530 via Digitalmars-d
On Monday, 4 April 2016 at 17:23:49 UTC, Jack Stouffer wrote: I made a bug to track this problem: https://issues.dlang.org/show_bug.cgi?id=15873 You might add link to this thread and github where he made the original comment..

Re: Any usable SIMD implementation?

2016-04-04 Thread Jack Stouffer via Digitalmars-d
On Sunday, 3 April 2016 at 07:39:00 UTC, Manu wrote: My SIMD implementation has been blocked on that for years too. I need to know the SIMD level flags passed to the compiler at least, and DMD needs to introduce the concept. I made a bug to track this problem:

Re: Any usable SIMD implementation?

2016-04-04 Thread Marco Leise via Digitalmars-d
Am Mon, 04 Apr 2016 14:02:03 + schrieb 9il : > Target cpu configuration: > - CPU architecture (done) > - Count of FP/Integer registers > - Allowed sets of instructions: for example, AVX2, FMA4 > - Compiler optimization options (for math) > > Ilya - On amd64,

Re: Any usable SIMD implementation?

2016-04-04 Thread 9il via Digitalmars-d
On Sunday, 3 April 2016 at 06:33:13 UTC, Iain Buclaw wrote: On 3 Apr 2016 8:15 am, "9il via Digitalmars-d" wrote: Hello Martin, Is it possible to introduce compile time information about target platform? I am working on BLAS from scratch implementation. And it

Re: Any usable SIMD implementation?

2016-04-04 Thread Marco Leise via Digitalmars-d
Am Sun, 03 Apr 2016 06:14:23 + schrieb 9il : > Hello Martin, > > Is it possible to introduce compile time information about target > platform? I am working on BLAS from scratch implementation. And > it is no hope to create something useable without CT information

Re: Any usable SIMD implementation?

2016-04-03 Thread Walter Bright via Digitalmars-d
On 4/3/2016 7:12 PM, Jack Stouffer wrote: On Sunday, 3 April 2016 at 22:00:51 UTC, Walter Bright wrote: There is no issue I can find about being blocked for years on SIMD flags. I guarantee you that if you never report the problems you're having, you will suffer in silence and they will not get
