[Mono-dev] Mono.SIMD supported platforms
Hi. I can't find any information on what platforms are currently supported by Mono.Simd. In particular, is Mono.Simd hardware accelerated on iPhone and Android? ___ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] Mono.SIMD supported platforms
Only x86 and amd64 are supported.

On Mon, Apr 16, 2012 at 9:22 AM, Alexander Mezin mezin.alexan...@gmail.com wrote:
> Hi. I can't find any information on what platforms are currently supported by Mono.Simd. In particular, is Mono.Simd hardware accelerated on iPhone and Android?
Re: [Mono-dev] mono.simd
> This patch is contributed under the MIT license.
>
> I don't have push access to the main repository, so please commit the patch yourself.

This is an oversight. Could I have your GitHub account so I can add you to the group?
Re: [Mono-dev] mono.simd
> This patch is contributed under the MIT license.
>
> I don't have push access to the main repository, so please commit the patch yourself.

Ah, never mind, found you: robert-j

You are now part of the Mono commit team.
Re: [Mono-dev] mono.simd
Hi Rodrigo,

On 07.09.2010 02:32, Rodrigo Kumpera wrote:
> Robert, can you commit your patch after you state the license of it? Either via email on MDL or on the commit message.

This patch is contributed under the MIT license.

I don't have push access to the main repository, so please commit the patch yourself.

Thanks,
Robert
Re: [Mono-dev] mono.simd
Robert, can you commit your patch after you state the license of it? Either via email on MDL or on the commit message.

On Wed, Aug 25, 2010 at 11:56 PM, Rodrigo Kumpera kump...@gmail.com wrote:
> On Mon, Aug 23, 2010 at 7:01 PM, Robert Jordan robe...@gmx.net wrote:
>> On 23.08.2010 23:13, Rodrigo Kumpera wrote:
>>> I think it's easier to catch the security exception under MS since its accel mode is None anyway.
>>
>> I had to move the icall's call site outside the .cctor and mark the call site's method as non-inlineable to make this work. Thanks for the hint.
>>
>> http://github.com/robert-j/mono/commit/0450be20f52c64e2788287400fe1eb9ff9be6817
>>
>> Robert
>
> Patch looks good. Please just state the license it's under and you can commit it.
Re: [Mono-dev] mono.simd
I think you missed the important part of that last email. It wanted you to state the license of the patch, then commit it :)

Alan.

On 27 Aug 2010 02:10, Jerry Maine - KF5ADY crashfou...@gmail.com wrote:
> Please, I found this bug to be very annoying as it hampers the use of dynamic languages with Mono.Simd. I found this bug trying to use Mono.Simd in IronPython.
>
> On 08/25/2010 09:56 PM, Rodrigo Kumpera wrote:
>> On Mon, Aug 23, 2010 at 7:01 PM, Robert Jordan robe...@gmx.net wrote:
>>> On 23.08.20...
Re: [Mono-dev] mono.simd
Please, I found this bug to be very annoying as it hampers the use of dynamic languages with Mono.Simd. I found this bug trying to use Mono.Simd in IronPython.

On 08/25/2010 09:56 PM, Rodrigo Kumpera wrote:
> On Mon, Aug 23, 2010 at 7:01 PM, Robert Jordan robe...@gmx.net wrote:
>> On 23.08.2010 23:13, Rodrigo Kumpera wrote:
>>> I think it's easier to catch the security exception under MS since its accel mode is None anyway.
>>
>> I had to move the icall's call site outside the .cctor and mark the call site's method as non-inlineable to make this work. Thanks for the hint.
>>
>> http://github.com/robert-j/mono/commit/0450be20f52c64e2788287400fe1eb9ff9be6817
>>
>> Robert
>
> Patch looks good. Please just state the license it's under and you can commit it.
Re: [Mono-dev] mono.simd
Well, I tried to find a place that would have a system properties store, where a key like mono.simd.accel could be used to get back the available acceleration capabilities. It could make the code a bit cleaner by not making an internal method call inside the Mono.Simd assembly. Any ideas where that could be?

On 08/23/2010 05:01 PM, Robert Jordan wrote:
> On 23.08.2010 23:13, Rodrigo Kumpera wrote:
>> I think it's easier to catch the security exception under MS since its accel mode is None anyway.
>
> I had to move the icall's call site outside the .cctor and mark the call site's method as non-inlineable to make this work. Thanks for the hint.
>
> http://github.com/robert-j/mono/commit/0450be20f52c64e2788287400fe1eb9ff9be6817
>
> Robert
Re: [Mono-dev] mono.simd
On Mon, Aug 23, 2010 at 7:01 PM, Robert Jordan robe...@gmx.net wrote:
> On 23.08.2010 23:13, Rodrigo Kumpera wrote:
>> I think it's easier to catch the security exception under MS since its accel mode is None anyway.
>
> I had to move the icall's call site outside the .cctor and mark the call site's method as non-inlineable to make this work. Thanks for the hint.
>
> http://github.com/robert-j/mono/commit/0450be20f52c64e2788287400fe1eb9ff9be6817
>
> Robert

Patch looks good. Please just state the license it's under and you can commit it.
Re: [Mono-dev] mono.simd
On 23.08.2010 13:16, Robert Jordan wrote:
> On 23.08.2010 04:53, Jerry Maine - KF5ADY wrote:
>> I found a discrepancy between Mono.Simd.SimdRuntime.AccelMode and its equivalent access by reflection. I believe this is a bug. Attached is a test for this. I believe there are more cases of this in Mono.Simd. Any ideas on how to fix this?
>
> Assuming that you want to fix this in mono: you could implement Mono.Simd.SimdRuntime.AccelMode as an icall. This will ensure that both the fast path and the slow path yield the same value.

A patch proposal:

http://github.com/robert-j/mono/commit/1107e83b1de0d65a00b2f62d1e88f275f17797e6

Robert
Re: [Mono-dev] mono.simd
Would the C# portion of the patch work on MS .NET?
Re: [Mono-dev] mono.simd
On 23.08.2010 19:24, Jerry Maine wrote:
> Would the C# portion of the patch work on MS .NET?

Dammit! I thought the icall would be ignored by MS.NET because I took care not to invoke it in this case. But icalls are not allowed in assemblies other than mscorlib under MS.NET.

Unless I'm misguided, the only solution seems to revolve around adding a branch to mono_marshal_get_runtime_invoke () in marshal.c. Schematic code:

    if (method->klass == Mono.Simd.SimdRuntime) {
        need_direct_wrapper = TRUE;
    }

The flag will instruct this function to create yet another wrapper around calls to methods of the Mono.Simd.SimdRuntime class. This additional wrapper lets the runtime take the fast path even for reflection calls.

Robert
Re: [Mono-dev] mono.simd
On Mon, Aug 23, 2010 at 3:29 PM, Robert Jordan robe...@gmx.net wrote:
> On 23.08.2010 19:24, Jerry Maine wrote:
>> Would the C# portion of the patch work on MS .NET?
>
> Dammit! I thought the icall would be ignored by MS.NET because I took care not to invoke it in this case. But icalls are not allowed in assemblies other than mscorlib under MS.NET.
>
> Unless I'm misguided, the only solution seems to revolve around adding a branch to mono_marshal_get_runtime_invoke () in marshal.c. Schematic code:
>
>     if (method->klass == Mono.Simd.SimdRuntime) {
>         need_direct_wrapper = TRUE;
>     }
>
> The flag will instruct this function to create yet another wrapper around calls to methods of the Mono.Simd.SimdRuntime class. This additional wrapper lets the runtime take the fast path even for reflection calls.

I think it's easier to catch the security exception under MS since its accel mode is None anyway.
Re: [Mono-dev] mono.simd
On 23.08.2010 23:13, Rodrigo Kumpera wrote:
> I think it's easier to catch the security exception under MS since its accel mode is None anyway.

I had to move the icall's call site outside the .cctor and mark the call site's method as non-inlineable to make this work. Thanks for the hint.

http://github.com/robert-j/mono/commit/0450be20f52c64e2788287400fe1eb9ff9be6817

Robert
Re: [Mono-dev] mono.simd
I have an alternate idea that I'd like to research. It may lead to a cleaner implementation.

On Mon, Aug 23, 2010 at 5:01 PM, Robert Jordan robe...@gmx.net wrote:
> On 23.08.2010 23:13, Rodrigo Kumpera wrote:
>> I think it's easier to catch the security exception under MS since its accel mode is None anyway.
>
> I had to move the icall's call site outside the .cctor and mark the call site's method as non-inlineable to make this work. Thanks for the hint.
>
> http://github.com/robert-j/mono/commit/0450be20f52c64e2788287400fe1eb9ff9be6817
>
> Robert
[Mono-dev] mono.simd
I found a discrepancy between Mono.Simd.SimdRuntime.AccelMode and its equivalent access by reflection. I believe this is a bug. Attached is a test for this. I believe there are more cases of this in Mono.Simd. Any ideas on how to fix this?

    using System;
    using Mono.Simd;

    namespace simd
    {
        class SimdReflectionTest
        {
            public static void Main (string[] args)
            {
                // Direct property access.
                Console.WriteLine (SimdRuntime.AccelMode);
                // The same property read through reflection.
                Console.WriteLine (typeof (SimdRuntime).GetProperty ("AccelMode").GetValue (null, null));

                // Report when the two access paths disagree.
                if (SimdRuntime.AccelMode != (AccelMode) typeof (SimdRuntime).GetProperty ("AccelMode").GetValue (null, null)) {
                    Console.WriteLine ("SimdRuntime.AccelMode != (AccelMode) typeof(SimdRuntime).GetProperty(\"AccelMode\").GetValue(null,null)");
                }
            }
        }
    }
Re: [Mono-dev] Mono.Simd AltiVec port
Hello Rodrigo and all,

Returning to my old problem, which deals with the alignment of vector variables. I noticed that on x86 vector locals are aligned on an 8-byte boundary instead of a 16-byte one, which forces the use of 'movups' instead of the much more efficient 'movaps'. On PowerPC there is no such bug, so I tried to compare the two routines for allocating locals. In 'mini-x86.c', in function 'mono_arch_allocate_vars', I discovered this strange (to me) piece of code:

    /*
     * EBP is at alignment 8 % MONO_ARCH_FRAME_ALIGNMENT, so if we
     * have locals larger than 8 bytes we need to make sure that
     * they have the appropriate offset.
     */
    if (MONO_ARCH_FRAME_ALIGNMENT > 8 && locals_stack_align > 8)
        offset += MONO_ARCH_FRAME_ALIGNMENT - sizeof (gpointer) * 2;

AFAIU, the 'if' condition is satisfied when there are vector locals, and in that case 'offset' is incremented by 16-4*2=8 bytes, thus spoiling the alignment. I tried removing these lines and didn't notice anything bad, except that the alignment got fixed. Moreover, there are no such lines in 'mini-amd64.c'. Can somebody explain to me the meaning of this piece?

--
Regards,
Sergei Dyshel

On Thu, Feb 4, 2010 at 03:59, Rodrigo Kumpera kump...@gmail.com wrote:
> Hi Sergei,
>
> On Tue, Feb 2, 2010 at 6:59 AM, Sergei Dyshel qyron.priv...@gmail.com wrote:
>> Hello all,
>>
>> I'm currently working on a PowerPC port of Mono which utilizes AltiVec SIMD instructions. During the development I've encountered an alignment problem. As far as I understood from running Mono's JIT, stack-allocated Mono.Simd.Vector* types are always aligned on a 16-byte boundary, but global ones aren't (such as static class members). This is not a problem for SSE, which has unaligned loads/stores, but AltiVec doesn't have them. Instead of implementing misaligned loads/stores for AltiVec I think it's better to force alignment in global variables, as is done in the case of the stack.
>
> No, the JIT doesn't align all Vector types to 16 bytes. There are places, like the spill code, that still don't do it correctly. Not a lot of work to get there, but still not done.
>
> If by global variables you mean statics, then making them properly aligned is possible with some trickery. The only alignment issue we can't currently fix is heap objects, due to how our GC works. Our new GC might eventually gain the ability to properly align such objects, but this is something for the far future.
>
>> Can somebody help me with that (e.g. point at relevant places in 'mini-ppc.c')?
>
> To fix the alignment of stack variables you need to mess with a bunch of places:
>
> - The spill code from mini-codegen.c
> - The var allocation code in mono_allocate_stack_slots (mini.c)
>
> To fix the static storage alignment you need to change the code that allocates the statics area to use the proper alignment. This is the same problem as with objects, as it uses a GC routine to allocate the memory blob. Fixing this requires going deep into the GC, which is not something simple.
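The arithmetic behind the comment quoted from mini-x86.c can be checked with a small standalone model. The sketch below is plain C, not Mono source: FRAME_ALIGNMENT and POINTER_SIZE are stand-ins for MONO_ARCH_FRAME_ALIGNMENT and sizeof (gpointer) on 32-bit x86, the helper names are made up for illustration, and it assumes locals are addressed as EBP minus an offset.

```c
#include <stdint.h>

#define FRAME_ALIGNMENT 16u /* stand-in for MONO_ARCH_FRAME_ALIGNMENT */
#define POINTER_SIZE 4u     /* stand-in for sizeof (gpointer) on 32-bit x86 */

/* Padding the quoted snippet adds when a frame holds over-aligned locals. */
uint32_t locals_padding(uint32_t locals_stack_align)
{
    if (FRAME_ALIGNMENT > 8 && locals_stack_align > 8)
        return FRAME_ALIGNMENT - POINTER_SIZE * 2;
    return 0;
}

/* Address of a local placed `offset` bytes below the frame pointer
 * (assumption: locals are addressed as EBP - offset). */
uint32_t local_address(uint32_t ebp, uint32_t offset)
{
    return ebp - offset;
}

/* Non-zero when addr is a multiple of FRAME_ALIGNMENT. */
int is_frame_aligned(uint32_t addr)
{
    return addr % FRAME_ALIGNMENT == 0;
}
```

With EBP = 0xFF8 (8 modulo 16, as the comment assumes), a 16-byte local placed after the 8 bytes of padding sits at offset 24 and lands on a 16-byte boundary, while the same local without padding (offset 16) does not. If EBP is itself a multiple of 16 the situation is reversed. That may be why removing the lines appeared to fix the alignment, but it depends on how the caller aligned the stack, which this model cannot confirm.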
Re: [Mono-dev] Mono.Simd AltiVec port
The way to handle those situations is to have an arch decomposition pass that converts MULPS into a VZERO + MULADD. For bonus points, you can add to the arch peephole code to fuse MULPS + ADDPS.

For an example of that, take a look at mini-x86.c / mono_arch_decompose_opts.

Rodrigo

On Tue, Feb 9, 2010 at 11:57 AM, Sergei Dyshel qyron.priv...@gmail.com wrote:
> Hi,
>
> Now I'm stuck with another problem on PPC. For multiplication of floats AltiVec has only a fused multiply-add instruction which does a*b+c. So in order to implement OP_MULPS I need to ensure c==0. The only solution which comes to mind is:
>
>     XZERO  D
>     MULADD D = S1, S2, D
>
> where MULADD is the instruction and D, S1, S2 are ins->dreg, sreg1, sreg2. But this solution won't work in cases where S1=D or S2=D, since D would be zeroed before use. So 2 possibilities remain:
>
> 1) Make sure that D != S1 and D != S2, and then the previously mentioned solution will work.
> 2) Allocate an additional (vector) register for MULPS and somehow store it inside the MonoInst structure.
>
> What is the traditional way to do such things? I really need to solve this problem, any help will be greatly appreciated!
>
> Thanks,
> Sergei
>
> On Thu, Feb 4, 2010 at 02:59, Rodrigo Kumpera kump...@gmail.com wrote:
>> Hi Sergei,
>>
>> On Tue, Feb 2, 2010 at 6:59 AM, Sergei Dyshel qyron.priv...@gmail.com wrote:
>>> Hello all,
>>>
>>> I'm currently working on a PowerPC port of Mono which utilizes AltiVec SIMD instructions. During the development I've encountered an alignment problem. As far as I understood from running Mono's JIT, stack-allocated Mono.Simd.Vector* types are always aligned on a 16-byte boundary, but global ones aren't (such as static class members). This is not a problem for SSE, which has unaligned loads/stores, but AltiVec doesn't have them. Instead of implementing misaligned loads/stores for AltiVec I think it's better to force alignment in global variables, as is done in the case of the stack.
>>
>> No, the JIT doesn't align all Vector types to 16 bytes. There are places, like the spill code, that still don't do it correctly. Not a lot of work to get there, but still not done.
>>
>> If by global variables you mean statics, then making them properly aligned is possible with some trickery. The only alignment issue we can't currently fix is heap objects, due to how our GC works. Our new GC might eventually gain the ability to properly align such objects, but this is something for the far future.
>>
>>> Can somebody help me with that (e.g. point at relevant places in 'mini-ppc.c')?
>>
>> To fix the alignment of stack variables you need to mess with a bunch of places:
>>
>> - The spill code from mini-codegen.c
>> - The var allocation code in mono_allocate_stack_slots (mini.c)
>>
>> To fix the static storage alignment you need to change the code that allocates the statics area to use the proper alignment. This is the same problem as with objects, as it uses a GC routine to allocate the memory blob. Fixing this requires going deep into the GC, which is not something simple.
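The register-aliasing hazard in the question, and the reason a separate zero register avoids it, can be modelled in a few lines of C. This is only a toy register file, not Mono's IR or the real decomposition pass: the types, the emit_* helpers and the `tmp` parameter (standing in for a fresh vector register that a decomposition pass would allocate) are all assumptions made for illustration.

```c
#define LANES 4
#define NUM_VREGS 8

typedef struct { float lane[LANES]; } vreg_t;
typedef struct { vreg_t r[NUM_VREGS]; } regfile_t;

/* Fill every lane of register `reg` with `value`. */
void set_all(regfile_t *rf, int reg, float value)
{
    for (int i = 0; i < LANES; i++)
        rf->r[reg].lane[i] = value;
}

/* Model of a vector zeroing instruction: r[d] = 0. */
void emit_vzero(regfile_t *rf, int d)
{
    set_all(rf, d, 0.0f);
}

/* Model of a fused multiply-add: r[d] = r[a] * r[b] + r[c]. */
void emit_muladd(regfile_t *rf, int d, int a, int b, int c)
{
    vreg_t out;
    for (int i = 0; i < LANES; i++)
        out.lane[i] = rf->r[a].lane[i] * rf->r[b].lane[i] + rf->r[c].lane[i];
    rf->r[d] = out;
}

/* Naive lowering: zero the destination, then accumulate into it.
 * Gives the wrong product when d is the same register as s1 or s2. */
void lower_mulps_naive(regfile_t *rf, int d, int s1, int s2)
{
    emit_vzero(rf, d);
    emit_muladd(rf, d, s1, s2, d);
}

/* Lowering with a separate zero register: the sources are never
 * clobbered, so d may alias s1 or s2. `tmp` must differ from both. */
void lower_mulps_with_temp(regfile_t *rf, int d, int s1, int s2, int tmp)
{
    emit_vzero(rf, tmp);
    emit_muladd(rf, d, s1, s2, tmp);
}
```

With r1 = 2 and r2 = 3 in every lane, the naive lowering of "r1 = r1 * r2" zeroes r1 before it is read and produces 0, while the variant with a separate zero register produces 6. Doing the split before register allocation, as the reply suggests, lets the allocator pick that extra register instead of storing it in the instruction.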
Re: [Mono-dev] Mono.Simd AltiVec port
Hi,

Now I'm stuck with another problem on PPC. For multiplication of floats AltiVec has only a fused multiply-add instruction which does a*b+c. So in order to implement OP_MULPS I need to ensure c==0. The only solution which comes to mind is:

    XZERO  D
    MULADD D = S1, S2, D

where MULADD is the instruction and D, S1, S2 are ins->dreg, sreg1, sreg2. But this solution won't work in cases where S1=D or S2=D, since D would be zeroed before use. So 2 possibilities remain:

1) Make sure that D != S1 and D != S2, and then the previously mentioned solution will work.
2) Allocate an additional (vector) register for MULPS and somehow store it inside the MonoInst structure.

What is the traditional way to do such things? I really need to solve this problem, any help will be greatly appreciated!

Thanks,
Sergei

On Thu, Feb 4, 2010 at 02:59, Rodrigo Kumpera kump...@gmail.com wrote:
> Hi Sergei,
>
> On Tue, Feb 2, 2010 at 6:59 AM, Sergei Dyshel qyron.priv...@gmail.com wrote:
>> Hello all,
>>
>> I'm currently working on a PowerPC port of Mono which utilizes AltiVec SIMD instructions. During the development I've encountered an alignment problem. As far as I understood from running Mono's JIT, stack-allocated Mono.Simd.Vector* types are always aligned on a 16-byte boundary, but global ones aren't (such as static class members). This is not a problem for SSE, which has unaligned loads/stores, but AltiVec doesn't have them. Instead of implementing misaligned loads/stores for AltiVec I think it's better to force alignment in global variables, as is done in the case of the stack.
>
> No, the JIT doesn't align all Vector types to 16 bytes. There are places, like the spill code, that still don't do it correctly. Not a lot of work to get there, but still not done.
>
> If by global variables you mean statics, then making them properly aligned is possible with some trickery. The only alignment issue we can't currently fix is heap objects, due to how our GC works. Our new GC might eventually gain the ability to properly align such objects, but this is something for the far future.
>
>> Can somebody help me with that (e.g. point at relevant places in 'mini-ppc.c')?
>
> To fix the alignment of stack variables you need to mess with a bunch of places:
>
> - The spill code from mini-codegen.c
> - The var allocation code in mono_allocate_stack_slots (mini.c)
>
> To fix the static storage alignment you need to change the code that allocates the statics area to use the proper alignment. This is the same problem as with objects, as it uses a GC routine to allocate the memory blob. Fixing this requires going deep into the GC, which is not something simple.
Re: [Mono-dev] Mono.Simd AltiVec port
Hi Sergei,

On Tue, Feb 2, 2010 at 6:59 AM, Sergei Dyshel qyron.priv...@gmail.com wrote:
> Hello all,
>
> I'm currently working on a PowerPC port of Mono which utilizes AltiVec SIMD instructions. During the development I've encountered an alignment problem. As far as I understood from running Mono's JIT, stack-allocated Mono.Simd.Vector* types are always aligned on a 16-byte boundary, but global ones aren't (such as static class members). This is not a problem for SSE, which has unaligned loads/stores, but AltiVec doesn't have them. Instead of implementing misaligned loads/stores for AltiVec I think it's better to force alignment in global variables, as is done in the case of the stack.

No, the JIT doesn't align all Vector types to 16 bytes. There are places, like the spill code, that still don't do it correctly. Not a lot of work to get there, but still not done.

If by global variables you mean statics, then making them properly aligned is possible with some trickery. The only alignment issue we can't currently fix is heap objects, due to how our GC works. Our new GC might eventually gain the ability to properly align such objects, but this is something for the far future.

> Can somebody help me with that (e.g. point at relevant places in 'mini-ppc.c')?

To fix the alignment of stack variables you need to mess with a bunch of places:

- The spill code from mini-codegen.c
- The var allocation code in mono_allocate_stack_slots (mini.c)

To fix the static storage alignment you need to change the code that allocates the statics area to use the proper alignment. This is the same problem as with objects, as it uses a GC routine to allocate the memory blob. Fixing this requires going deep into the GC, which is not something simple.
[Mono-dev] Mono.Simd AltiVec port
Hello all,

I'm currently working on a PowerPC port of Mono which utilizes AltiVec SIMD instructions. During the development I've encountered an alignment problem. As far as I understood from running Mono's JIT, stack-allocated Mono.Simd.Vector* types are always aligned on a 16-byte boundary, but global ones aren't (such as static class members). This is not a problem for SSE, which has unaligned loads/stores, but AltiVec doesn't have them. Instead of implementing misaligned loads/stores for AltiVec I think it's better to force alignment in global variables, as is done in the case of the stack.

Can somebody help me with that (e.g. point at relevant places in 'mini-ppc.c')?

Thanks,
Sergei Dyshel
[Mono-dev] Mono.Simd and Threefish256
As part of my free time, I decided to start down the path to SIMD-ing some cryptography algorithms. As a starter exercise, I took Threefish256 from the SHA-3 submission Skein. The experience was very enlightening, and as I haven't been able to find anything of substance out there about working with Mono.Simd, I thought I'd write some articles about it.

I'm posting my experience to my blog in a 5-part series. The first of the posts has already been published, and I'll have the rest ready by the end of the weekend:

http://blog.xpdm.us/2009/10/01/skein-threefish-and-mono-simd-part-1/

Thanks to all the folks who've been keeping the Mono.Simd project going.

--
Marcus Griep
Re: [Mono-dev] Mono.SIMD
Hey,

The big issue you're having is that you haven't implemented a SIMD algorithm ;) I spent 15 mins 'optimising' your code and came up with this. Notice that I made everything a SIMD operation. There is no scalar code in the method anymore. This tripled performance as compared to the non-SIMD version. On my machine:

    -FLOAT 00:00:00.3888930 Color
    -SIMD 00:00:00.1266820 Mono.Simd.Vector4f

You'd want to double check the result just in case I made a mistake with my alterations.

Alan.

    public static Vector4f GradientSIMD()
    {
        Vector4f finv_WH = new Vector4f (1.0f / (w*h), 1.0f / (w*h), 1.0f / (w*h), 1.0f / (w*h));
        Vector4f ret = new Vector4f();
        Vector4f a = new Vector4f(0.0f, 0.0f, 1.0f, 1.0f);
        a += new Vector4f(0.0f, 1.0f, 0.0f, 1.0f);
        a += new Vector4f(1.0f, 0.0f, 0.0f, 1.0f);
        a += new Vector4f(0.5f, 0.5f, 1.0f, 1.0f);

        //Process operator
        Vector4f yVec = new Vector4f (h, h, 0, 0);
        Vector4f yDiff = new Vector4f (-1, -1, 1, 1);
        for (int y=0; y<h; y++) {
            Vector4f factor = yVec * finv_WH;
            yVec += yDiff;
            Vector4f xVec = new Vector4f (w, 0, w, 0);
            Vector4f xDiff = new Vector4f (-1, 1, -1, 1);
            for (int x=0; x<w; x++) {
                ret += (a * xVec * factor);
                xVec += xDiff;
            }
        }
        return ret;
    }

On Fri, Feb 20, 2009 at 8:12 AM, Johann_fxgen jnadalu...@gmail.com wrote:
> I have done some performance tests of SIMD under Windows. Test results in ms:
>
>     In MS C     235  (Visual Studio Release Mode With SIMD)
>     In MS C     360  (Visual Studio Release Mode With 4D Float)
>     In Mono C#  453  (With Mono SIMD)
>     In Mono C#  562  (With Mono 4D Float)
>     In MS C#    609  (Visual Studio With 4D Float)
>     In MS C     672  (Visual Studio Debug Mode)
>
> I'm just surprised by the difference between the C SIMD and Mono SIMD versions. Is Mono.SIMD faster under Linux than under Windows?
>
> Johann.
>
> My mono code for test:
>
>     using Mono.Simd;
>     using System;
>     using Mono;
>
>     public struct Color
>     {
>         public float r,g,b,a;
>     };
>
>     public class TestMonoSIMD
>     {
>         public Color m_pixels;
>         const int w = 4096;
>         const int h = 4096;
>
>         public static void Main ()
>         {
>             //Debug
>             Console.WriteLine("AccelMode: {0}", Mono.Simd.SimdRuntime.AccelMode);
>
>             //Without SIMD
>             DateTime start1 = DateTime.Now;
>             Color ret1 = Gradient();
>             TimeSpan ts1 = DateTime.Now - start1;
>             Console.WriteLine("-FLOAT {0} {1}", ts1, ret1);
>
>             //With SIMD
>             DateTime start2 = DateTime.Now;
>             Vector4f ret2 = GradientSIMD();
>             TimeSpan ts2 = DateTime.Now - start2;
>             Console.WriteLine("-SIMD {0} {1}", ts2, ret2);
>         }
>
>         public static Color Gradient()
>         {
>             float finv_WH = 1.0f / (float)(w*h);
>             Color ret = new Color();
>             ret.r=ret.g=ret.b=ret.a=0.0f;
>             Color a = new Color();
>             Color b = new Color();
>             Color c = new Color();
>             Color d = new Color();
>             a.r=0.0f; a.g=0.0f; a.b=1.0f; a.a=1.0f;
>             b.r=0.0f; b.g=1.0f; b.b=0.0f; b.a=1.0f;
>             c.r=1.0f; c.g=0.0f; c.b=0.0f; c.a=1.0f;
>             d.r=0.5f; d.g=0.5f; d.b=1.0f; d.a=1.0f;
>
>             //Process operator
>             for (int y=0; y<h; y++) {
>                 for (int x=0; x<w; x++) {
>                     //Calc percent A,B,C,D
>                     float pa = (float)((w-x)* (h-y)) * finv_WH;
>                     float pb = (float)((x) * (h-y)) * finv_WH;
>                     float pc = (float)((w-x)* (y)) * finv_WH;
>                     float pd = (float)((x) * (y)) * finv_WH;
>                     float cr= ((a.r*pa) + (b.r*pb) + (c.r*pc) + (d.r*pd));
>                     float cg= ((a.g*pa) + (b.g*pb) + (c.g*pc) + (d.g*pd));
>                     float cb= ((a.b*pa) + (b.b*pb) + (c.b*pc) + (d.b*pd));
>                     float ca= ((a.a*pa) + (b.a*pb) + (c.a*pc) + (d.a*pd));
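Alan's request to double check the result can be approached with a scalar model on a tiny grid. The sketch below is plain C rather than Mono.Simd: `gradient_reference` follows the posted Gradient() loop, and `gradient_lanewise` mimics, lane by lane, what the posted GradientSIMD() appears to compute (the four corner colours summed first, then lane k scaled by weight k). The function names and the array layout {r, g, b, a} are assumptions made for this illustration.

```c
#define LANES 4

/* Corner colours from the posted test, stored as {r, g, b, a}. */
static const float CORNER_A[LANES] = {0.0f, 0.0f, 1.0f, 1.0f};
static const float CORNER_B[LANES] = {0.0f, 1.0f, 0.0f, 1.0f};
static const float CORNER_C[LANES] = {1.0f, 0.0f, 0.0f, 1.0f};
static const float CORNER_D[LANES] = {0.5f, 0.5f, 1.0f, 1.0f};

/* Scalar reference: every channel blends all four corners per pixel. */
void gradient_reference(int w, int h, float out[LANES])
{
    const float inv_wh = 1.0f / (float)(w * h);
    for (int k = 0; k < LANES; k++)
        out[k] = 0.0f;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            float pa = (float)((w - x) * (h - y)) * inv_wh;
            float pb = (float)(x * (h - y)) * inv_wh;
            float pc = (float)((w - x) * y) * inv_wh;
            float pd = (float)(x * y) * inv_wh;
            for (int k = 0; k < LANES; k++)
                out[k] += CORNER_A[k] * pa + CORNER_B[k] * pb
                        + CORNER_C[k] * pc + CORNER_D[k] * pd;
        }
    }
}

/* Lane-by-lane model of the posted SIMD loop: the corner sum in lane k
 * is multiplied by weight k only (pa, pb, pc, pd in lanes 0..3). */
void gradient_lanewise(int w, int h, float out[LANES])
{
    const float inv_wh = 1.0f / (float)(w * h);
    const float y_diff[LANES] = {-1.0f, -1.0f, 1.0f, 1.0f};
    const float x_diff[LANES] = {-1.0f, 1.0f, -1.0f, 1.0f};
    float sum[LANES];
    float y_vec[LANES] = {(float)h, (float)h, 0.0f, 0.0f};
    for (int k = 0; k < LANES; k++) {
        sum[k] = CORNER_A[k] + CORNER_B[k] + CORNER_C[k] + CORNER_D[k];
        out[k] = 0.0f;
    }
    for (int y = 0; y < h; y++) {
        float factor[LANES];
        float x_vec[LANES] = {(float)w, 0.0f, (float)w, 0.0f};
        for (int k = 0; k < LANES; k++) {
            factor[k] = y_vec[k] * inv_wh;
            y_vec[k] += y_diff[k];
        }
        for (int x = 0; x < w; x++) {
            for (int k = 0; k < LANES; k++) {
                out[k] += sum[k] * x_vec[k] * factor[k];
                x_vec[k] += x_diff[k];
            }
        }
    }
}
```

For w = h = 2 the reference yields (0.875, 0.875, 2.5, 4) while the lane-wise model yields (3.375, 1.125, 1.5, 1), so under this reading the two loops accumulate different quantities and the timings may not compare like with like. A vector version that matches the scalar loop would presumably need each of the four weights broadcast across all lanes and multiplied with its own corner colour.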
Re: [Mono-dev] Mono.SIMD
Hey,

The C++ code seems very similar to the C# SIMD code, so I don't know what would make the C# version any faster. This question would be best directed at the JIT guys, who may know what causes the difference.

If you want to try speeding up the mono version, you should just use trial and error to see if you can rewrite things so that you can get better performance. For example, unrolling the loop may improve performance noticeably.

Alan.

On Mon, Feb 23, 2009 at 1:16 PM, Johann Nadalutti jnadalu...@gmail.com wrote:
> Hey,
>
> thanks a lot for your modifications. I now have SIMD 3x faster than the 4D float version!
>
> I wrote the same code in C++ and it's 3x faster than Mono.SIMD. I just want to know why, and how to optimize my Mono code.
>
> What do you use as an IDE to develop and debug Mono?
>
> My Visual C++ code for test:
>
>     class VectorSIMD
>     {
>     public:
>         VectorSIMD();
>         VectorSIMD(float x, float y, float z, float w);
>
>         VectorSIMD operator*(const VectorSIMD& other)
>         {
>             VectorSIMD r;
>             r.vec = _mm_mul_ps(vec, other.vec);
>             return r;
>         }
>
>         VectorSIMD operator*(float f)
>         {
>             VectorSIMD r;
>             __m128 b = _mm_load1_ps(&f);
>             r.vec = _mm_mul_ps(vec, b);
>             return r;
>         }
>
>         VectorSIMD operator+(const VectorSIMD& other)
>         {
>             VectorSIMD r;
>             r.vec = _mm_add_ps(vec, other.vec);
>             return r;
>         }
>
>         //Datas
>         union {
>             __m128 vec;
>             struct { float x, y, z, w; };
>         };
>     };
>
>     VectorSIMD::VectorSIMD() { }
>
>     VectorSIMD::VectorSIMD(float _x, float _y, float _z, float _w)
>     {
>         x=_x; y=_y; z=_z; w=_w;
>     }
>
>     VectorSIMD GradientSIMD()
>     {
>         VectorSIMD finv_WH(1.0f / (_W*_H), 1.0f / (_W*_H), 1.0f / (_W*_H), 1.0f / (_W*_H));
>         VectorSIMD ret(0.0, 0.0, 0.0, 0.0);
>         VectorSIMD a(0.0f, 0.0f, 1.0f, 1.0f);
>         a = a + VectorSIMD(0.0f, 1.0f, 0.0f, 1.0f);
>         a = a + VectorSIMD(1.0f, 0.0f, 0.0f, 1.0f);
>         a = a + VectorSIMD(0.5f, 0.5f, 1.0f, 1.0f);
>
>         //Process operator
>         VectorSIMD yVec(_H, _H, 0, 0);
>         VectorSIMD yDiff(-1.0f, -1.0f, 1.0f, 1.0f);
>         for (int y=0; y<_H; y++) {
>             VectorSIMD factor = yVec * finv_WH;
>             yVec = yVec + yDiff;
>             VectorSIMD xVec(_W, 0, _W, 0);
>             VectorSIMD xDiff(-1.0f, 1.0f, -1.0f, 1.0f);
>             for (int x=0; x<_W; x++) {
>                 ret = ret + (a*xVec*factor);
>                 xVec = xVec + xDiff;
>             }
>         }
>         return ret;
>     }
>
> Johann.
>
> 2009/2/23 Alan McGovern alan.mcgov...@gmail.com
>> Hey,
>>
>> The big issue you're having is that you haven't implemented a SIMD algorithm ;) I spent 15 mins 'optimising' your code and came up with this. Notice that I made everything a SIMD operation. There is no scalar code in the method anymore. This tripled performance as compared to the non-SIMD version. On my machine:
>>
>>     -FLOAT 00:00:00.3888930 Color
>>     -SIMD 00:00:00.1266820 Mono.Simd.Vector4f
>>
>> You'd want to double check the result just in case I made a mistake with my alterations.
>>
>> Alan.
>>
>>     public static Vector4f GradientSIMD()
>>     {
>>         Vector4f finv_WH = new Vector4f (1.0f / (w*h), 1.0f / (w*h), 1.0f / (w*h), 1.0f / (w*h));
>>         Vector4f ret = new Vector4f();
>>         Vector4f a = new Vector4f(0.0f, 0.0f, 1.0f, 1.0f);
>>         a += new Vector4f(0.0f, 1.0f, 0.0f, 1.0f);
>>         a += new Vector4f(1.0f, 0.0f, 0.0f, 1.0f);
>>         a += new Vector4f(0.5f, 0.5f, 1.0f, 1.0f);
>>
>>         //Process operator
>>         Vector4f yVec = new Vector4f (h, h, 0, 0);
>>         Vector4f yDiff = new Vector4f (-1, -1, 1, 1);
>>         for (int y=0; y<h; y++) {
>>             Vector4f factor = yVec * finv_WH;
>>             yVec += yDiff;
>>             Vector4f xVec = new Vector4f (w, 0, w, 0);
>>             Vector4f xDiff = new Vector4f (-1, 1, -1, 1);
>>             for (int x=0; x<w; x++) {
>>                 ret += (a * xVec * factor);
>>                 xVec += xDiff;
>>             }
>>         }
>>         return ret;
>>     }
>>
>> On Fri, Feb 20, 2009 at 8:12 AM, Johann_fxgen jnadalu...@gmail.com wrote:
>>> I have done some performance tests of SIMD under Windows. Test results in ms:
>>>
>>>     In MS C     235  (Visual Studio Release Mode With SIMD)
>>>     In MS C     360  (Visual Studio Release Mode With 4D Float)
>>>     In Mono C#  453  (With Mono SIMD)
>>>     In Mono C#  562  (With Mono 4D Float)
>>>     In MS C#    609  (Visual Studio With 4D Float)
>>>     In MS C     672  (Visual Studio Debug Mode)
>>>
>>> I'm just surprised by the difference between the C SIMD and Mono SIMD versions. Is Mono.SIMD faster under Linux than under Windows?
>>>
>>> Johann.
>>>
>>> My mono code for test:
>>>
>>>     using Mono.Simd;
>>>     using System;
>>>     using Mono;
>>>
>>>     public struct Color
>>>     {
>>>         public float r,g,b,a;
>>>     };
>>>
>>>     public class TestMonoSIMD
>>>     {
>>>         public Color m_pixels;
>>>         const int w = 4096;
[Mono-dev] Mono.SIMD
I have done some performance tests of SIMD under Windows. Test results in ms:

    In MS C     235  (Visual Studio Release Mode With SIMD)
    In MS C     360  (Visual Studio Release Mode With 4D Float)
    In Mono C#  453  (With Mono SIMD)
    In Mono C#  562  (With Mono 4D Float)
    In MS C#    609  (Visual Studio With 4D Float)
    In MS C     672  (Visual Studio Debug Mode)

I'm just surprised by the difference between the C SIMD and Mono SIMD versions. Is Mono.SIMD faster under Linux than under Windows?

Johann.

My mono code for test:

    using Mono.Simd;
    using System;
    using Mono;

    public struct Color
    {
        public float r,g,b,a;
    };

    public class TestMonoSIMD
    {
        public Color m_pixels;
        const int w = 4096;
        const int h = 4096;

        public static void Main ()
        {
            //Debug
            Console.WriteLine("AccelMode: {0}", Mono.Simd.SimdRuntime.AccelMode);

            //Without SIMD
            DateTime start1 = DateTime.Now;
            Color ret1 = Gradient();
            TimeSpan ts1 = DateTime.Now - start1;
            Console.WriteLine("-FLOAT {0} {1}", ts1, ret1);

            //With SIMD
            DateTime start2 = DateTime.Now;
            Vector4f ret2 = GradientSIMD();
            TimeSpan ts2 = DateTime.Now - start2;
            Console.WriteLine("-SIMD {0} {1}", ts2, ret2);
        }

        public static Color Gradient()
        {
            float finv_WH = 1.0f / (float)(w*h);
            Color ret = new Color();
            ret.r=ret.g=ret.b=ret.a=0.0f;
            Color a = new Color();
            Color b = new Color();
            Color c = new Color();
            Color d = new Color();
            a.r=0.0f; a.g=0.0f; a.b=1.0f; a.a=1.0f;
            b.r=0.0f; b.g=1.0f; b.b=0.0f; b.a=1.0f;
            c.r=1.0f; c.g=0.0f; c.b=0.0f; c.a=1.0f;
            d.r=0.5f; d.g=0.5f; d.b=1.0f; d.a=1.0f;

            //Process operator
            for (int y=0; y<h; y++) {
                for (int x=0; x<w; x++) {
                    //Calc percent A,B,C,D
                    float pa = (float)((w-x)* (h-y)) * finv_WH;
                    float pb = (float)((x) * (h-y)) * finv_WH;
                    float pc = (float)((w-x)* (y)) * finv_WH;
                    float pd = (float)((x) * (y)) * finv_WH;
                    float cr= ((a.r*pa) + (b.r*pb) + (c.r*pc) + (d.r*pd));
                    float cg= ((a.g*pa) + (b.g*pb) + (c.g*pc) + (d.g*pd));
                    float cb= ((a.b*pa) + (b.b*pb) + (c.b*pc) + (d.b*pd));
                    float ca= ((a.a*pa) + (b.a*pb) + (c.a*pc) + (d.a*pd));
                    ret.r+=cr; ret.g+=cg; ret.b+=cb; ret.a+=ca;
                }
            }
            return ret;
        }

        public static Vector4f GradientSIMD()
        {
            float finv_WH = 1.0f / (float)(w*h);
            Vector4f ret = new Vector4f(0.0f, 0.0f, 0.0f, 0.0f);
            Vector4f a = new Vector4f(0.0f, 0.0f, 1.0f, 1.0f);
            Vector4f b = new Vector4f(0.0f, 1.0f, 0.0f, 1.0f);
            Vector4f c = new Vector4f(1.0f, 0.0f, 0.0f, 1.0f);
            Vector4f d = new Vector4f(0.5f, 0.5f, 1.0f, 1.0f);

            //Process operator
            Vector4f p = new Vector4f();
            Vector4f r = new Vector4f();
            for (int y=0; y<h; y++) {
                for (int x=0; x<w; x++) {
                    //Calc percent A,B,C,D
                    p.X = (float)((w-x) * (h-y)) * finv_WH;
                    p.Y = (float)((x) * (h-y)) * finv_WH;
                    p.Z = (float)((w-x) * (y)) * finv_WH;
                    p.W = (float)((x) *
Re: [Mono-dev] Mono.Simd: Accelerated methods analysis
Oh, BTW, there are two issues with your program.

The following code is wrong: mi.GetParameters()[i].GetType(). It should be mi.GetParameters()[i].ParameterType; otherwise you'll be querying the ParameterInfo class instead of the parameter type you want.

The other one is minor: some functions might not be reported as accelerated because you're running the program on an old machine without the required hardware support.

Cheers,
Rodrigo

On Wed, Dec 10, 2008 at 10:21 AM, Rodrigo Kumpera wrote:
> Hi Bart,
>
> Right now the only methods that are not accelerated are indexers. If any method is missing from this list, it's a bug.
>
> Cheers,
> Rodrigo
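The difference between GetType() and ParameterType can be seen without Mono.Simd at all. The following is only a minimal sketch using plain System.Reflection; the class ParameterTypeDemo and its Sample method are made-up names that serve as a reflection target, not anything from Mono.Simd.

```csharp
using System;
using System.Reflection;

public static class ParameterTypeDemo
{
    // Hypothetical method, present only so there is something to reflect over.
    public static void Sample(int count, string label) { }

    // Builds the Type[] signature of a method from its ParameterInfo[].
    public static Type[] SignatureOf(MethodInfo mi)
    {
        ParameterInfo[] ps = mi.GetParameters();
        Type[] types = new Type[ps.Length];
        for (int i = 0; i < ps.Length; i++)
        {
            // ParameterType is the declared type of the parameter.
            // ps[i].GetType() would instead describe the ParameterInfo object itself.
            types[i] = ps[i].ParameterType;
        }
        return types;
    }

    public static void Main()
    {
        MethodInfo mi = typeof(ParameterTypeDemo).GetMethod("Sample");
        ParameterInfo p = mi.GetParameters()[0];
        Console.WriteLine(p.GetType());      // runtime class of the ParameterInfo object
        Console.WriteLine(p.ParameterType);  // System.Int32
    }
}
```

A Type[] built this way is the kind of array that the three-argument IsMethodAccelerated call in the program above expects for picking one overload.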
Re: [Mono-dev] Mono.Simd: Accelerated methods analysis
> The following code is wrong mi.GetParameters()[i].GetType(), it should be mi.GetParameters()[i].ParameterType otherwise you'll be querying for ParameterInfo class instead of what you want.

Thanks, that was what I was looking for; the updated program is below.

> The other one is minor, in that some functions might not report as accelerated because you're running it on an old machine without support.

That is exactly what this program is supposed to do: show which functions are accelerated on a given machine and which are not, so I know whether I can expect a speed-up or should choose another option, without having to investigate each method separately. I just have to run this program once and keep the list at hand. As an example, the output on my MacBook Pro is appended at the end. Sorry for the lengthy mail ;-).

Bart

// Main.cs created with MonoDevelop
// User: masschel at 15:51 11/21/2008
//
// To change standard headers go to Edit->Preferences->Coding->Standard Headers
//

using System;
using Mono.Simd;
using System.Reflection;

namespace AcceleratedMethods
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            // Change to your location of Mono.Simd
            string monoSimdLocation = @"/Users/masschel/local/mono/lib/mono/2.0/Mono.Simd.dll";
            Assembly assembly = Assembly.LoadFile(monoSimdLocation);
            foreach(Type type in assembly.GetTypes())
            {
                string typeName = type.Name;
                if (typeName.Length>=6 && typeName.Substring(0,6) == "Vector")
                {
                    Console.WriteLine("Type {0}", type.Name);
                    foreach(MethodInfo mi in type.GetMethods())
                    {
                        string methodName = mi.Name;
                        bool ctu = methodName != "Equals" && methodName != "GetHashCode" && methodName != "ToString" && methodName != "GetType" /* && (methodName.Length>=4 && methodName.Substring(0, 4) != "get_" && methodName.Substring(0, 4) != "set_")*/;
                        if (ctu)
                        {
                            Type[] types = new Type[mi.GetParameters().Length];
                            Console.Write("  Method {0}(", mi.Name);
                            for(int i = 0; i < mi.GetParameters().Length; i++)
                            {
                                types[i] = mi.GetParameters()[i].ParameterType;
                                if (i+1<mi.GetParameters().Length)
                                    Console.Write("{0}, ", types[i].Name);
                                else
                                    Console.Write("{0}", types[i].Name);
                            }
                            Console.WriteLine("):{0} accelerated: {1}", mi.ReturnParameter, SimdRuntime.IsMethodAccelerated(type, mi.Name, types));
                        }
                    }
                }
            }
        }
    }
}

Type Vector2d
  Method AndNot(Vector2d, Vector2d):Vector2d accelerated: True
  Method HorizontalAdd(Vector2d, Vector2d):Vector2d accelerated: True
  Method AddSub(Vector2d, Vector2d):Vector2d accelerated: True
  Method HorizontalSub(Vector2d, Vector2d):Vector2d accelerated: True
  Method InterleaveHigh(Vector2d, Vector2d):Vector2d accelerated: True
  Method InterleaveLow(Vector2d, Vector2d):Vector2d accelerated: True
  Method CompareEqual(Vector2d, Vector2d):Vector2d accelerated: True
  Method CompareLessThan(Vector2d, Vector2d):Vector2d accelerated: True
  Method CompareLessEqual(Vector2d, Vector2d):Vector2d accelerated: True
  Method CompareUnordered(Vector2d, Vector2d):Vector2d accelerated: True
  Method CompareNotEqual(Vector2d, Vector2d):Vector2d accelerated: True
  Method CompareNotLessThan(Vector2d, Vector2d):Vector2d accelerated: True
  Method CompareNotLessEqual(Vector2d, Vector2d):Vector2d accelerated: True
  Method CompareOrdered(Vector2d, Vector2d):Vector2d accelerated: True
  Method Duplicate(Vector2d):Vector2d accelerated: True
  Method LoadAligned(Vector2d&):Vector2d accelerated: True
  Method StoreAligned(Vector2d&, Vector2d):Void accelerated: True
  Method LoadAligned(Vector2d*):Vector2d accelerated: True
  Method StoreAligned(Vector2d*, Vector2d):Void accelerated: True
  Method PrefetchTemporalAllCacheLevels(Vector2d&):Void accelerated: True
  Method PrefetchTemporal1stLevelCache(Vector2d&):Void accelerated: True
  Method PrefetchTemporal2ndLevelCache(Vector2d&):Void accelerated: True
  Method PrefetchNonTemporal(Vector2d&):Void accelerated: True
  Method PrefetchTemporalAllCacheLevels(Vector2d*):Void accelerated: True
  Method PrefetchTemporal1stLevelCache(Vector2d*):Void accelerated: True
  Method
[Mono-dev] Mono.Simd: Accelerated methods analysis
Hi all,

I've written a program that uses reflection to list the relevant methods in Mono.Simd and report whether they are accelerated or not (see below). This small program might be of interest to others, to see how well their processor behaves.

Some methods are overloaded, and for those I have to give the signature, but I'm a bit lost as to what this signature should look like. I tried to convert the ParameterInfo[] of the methods to Type[], as required by the IsMethodAccelerated method, but this gives erroneous results. Is it only the parameter list, or is there more to it? I thought of removing the overloaded methods (see list), but I guess I would risk removing relevant methods as well. The overloaded methods are mainly op_Explicit, LoadAligned, StoreAligned, and the PrefetchXxx methods. Are these relevant enough to show up in such a list?

Anyway, I'm quite thrilled to see that almost all of the methods are accelerated :-).

Bart

using System;
using Mono.Simd;
using System.Reflection;

namespace AcceleratedMethods
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            // Change to your location of Mono.Simd
            string monoSimdLocation = @"/Users/masschel/local/mono/lib/mono/2.0/Mono.Simd.dll";
            Assembly assembly = Assembly.LoadFile(monoSimdLocation);
            foreach(Type type in assembly.GetTypes())
            {
                string typeName = type.Name;
                if (typeName.Length>=6 && typeName.Substring(0,6) == "Vector")
                {
                    Console.WriteLine("Type {0}", type.Name);
                    foreach(MethodInfo mi in type.GetMethods())
                    {
                        string methodName = mi.Name;
                        bool ctu = methodName != "Equals" && methodName != "GetHashCode" && methodName != "ToString" && methodName != "GetType" && (methodName.Length>=4 && methodName.Substring(0, 4) != "get_" && methodName.Substring(0, 4) != "set_");
                        if (ctu)
                        {
                            try
                            {
                                Console.WriteLine("  Method {0} {1}", mi.Name, SimdRuntime.IsMethodAccelerated(type, mi.Name));
                            }
                            // Overloaded methods
                            catch (System.Reflection.AmbiguousMatchException amme)
                            {
                                Type[] types = new Type[mi.GetParameters().Length];
                                for(int i = 0; i < mi.GetParameters().Length; i++)
                                {
                                    types[i] = mi.GetParameters()[i].GetType();
                                }
                                Console.WriteLine("  AmbiguousMatchException for method {0} {1}", mi.Name, SimdRuntime.IsMethodAccelerated(type, mi.Name, types));
                            }
                        }
                    }
                }
            }
        }
    }
}
Re: [Mono-dev] mono.simd sugestions
Rodrigo Kumpera wrote:
> On Wed, Nov 19, 2008 at 4:23 PM, crashfourit wrote:
>> It would be nice to have the vector* have a constructor that takes in only one argument and fills all spots in the vector* with the same value. Like...
>>
>> Vector4f vector = new Vector4f(1);
>>
>> Second... I can really see someone doing this to use mono.simd in already established code base.
>>
>> [StructLayout( LayoutKind.Sequential, Pack = 0, Size = 16 )]
>> class Vector4 {
>>     /* some user defined vector methods... */
>>     private static explicit operator Vector4f(Vector4 v) {
>>         unsafe {
>>             Vector4f* p = (Vector4f*) &v;
>>             return *p;
>>         }
>>     }
>>     private static explicit operator Vector4(Vector4f v) {
>>         unsafe {
>>             Vector4* p = (Vector4*) &v;
>>             return *p;
>>         }
>>     }
>> }
>>
>> Is it possible to accelerate these user defined operator overloads? Or do I have to resort to C# style unions?
>
> This code will be inlined and work like a charm. But I recommend coding it in the following way to squeeze the maximum performance out of it:
>
> public static unsafe Vector4f AsVector(ref Vector4 v) {
>     fixed (Vector4 *f = &v) {
>         return *(Vector4f*)f;
>     }
> }
>
> This will avoid the extra copy of passing the valuetype by value on stack and will inline straight to a load from the load/array element to a simd machine register. Pretty cool, isn't it?

How will the JIT engine handle this?

public static unsafe Vector4 AsVector4(ref Vector4f v) {
    fixed (Vector4f *f = &v) {
        return *(Vector4*)f;
    }
}
Re: [Mono-dev] mono.simd sugestions
The JIT will generate reasonable code. It's in our plans to give attention to having a good integration story with existing code.

On Thu, Nov 20, 2008 at 9:15 PM, crashfourit wrote:
> How will the jit engine handle this?
>
> public static unsafe Vector4 AsVector4(ref Vector4f v) {
>     fixed (Vector4f *f = &v) {
>         return *(Vector4*)f;
>     }
> }
[Mono-dev] mono.simd sugestions
It would be nice for the Vector* types to have a constructor that takes only one argument and fills all elements of the vector with the same value. Like:

Vector4f vector = new Vector4f(1);

Second, I can really see someone doing this to use Mono.Simd in an already established code base:

[StructLayout( LayoutKind.Sequential, Pack = 0, Size = 16 )]
class Vector4 {
    /* some user defined vector methods... */
    private static explicit operator Vector4f(Vector4 v) {
        unsafe {
            Vector4f* p = (Vector4f*) &v;
            return *p;
        }
    }
    private static explicit operator Vector4(Vector4f v) {
        unsafe {
            Vector4* p = (Vector4*) &v;
            return *p;
        }
    }
}

Is it possible to accelerate these user-defined operator overloads? Or do I have to resort to C#-style unions?
Re: [Mono-dev] mono.simd sugestions
On Wed, Nov 19, 2008 at 4:23 PM, crashfourit wrote:
> It would be nice to have the vector* have a constructor that takes in only one argument and fills all spots in the vector* with the same value. Like...
>
> Vector4f vector = new Vector4f(1);
>
> Second... I can really see someone doing this to use mono.simd in already established code base.
>
> [StructLayout( LayoutKind.Sequential, Pack = 0, Size = 16 )]
> class Vector4 {
>     /* some user defined vector methods... */
>     private static explicit operator Vector4f(Vector4 v) {
>         unsafe {
>             Vector4f* p = (Vector4f*) &v;
>             return *p;
>         }
>     }
>     private static explicit operator Vector4(Vector4f v) {
>         unsafe {
>             Vector4* p = (Vector4*) &v;
>             return *p;
>         }
>     }
> }
>
> Is it possible to accelerate these user defined operator overloads? Or do I have to resort to C# style unions?

This code will be inlined and work like a charm. But I recommend coding it in the following way to squeeze the maximum performance out of it:

public static unsafe Vector4f AsVector(ref Vector4 v) {
    fixed (Vector4 *f = &v) {
        return *(Vector4f*)f;
    }
}

This will avoid the extra copy of passing the valuetype by value on stack and will inline straight to a load from the load/array element to a simd machine register. Pretty cool, isn't it?
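For comparison, the "C# style union" mentioned in the question could look roughly like the sketch below. It is an illustration only: UserVector4, SimdLikeVector4 and Vector4Union are hypothetical names, and SimdLikeVector4 merely stands in for Mono.Simd's Vector4f so that the code compiles without the Mono.Simd assembly. Nothing here says how the Mono JIT would treat such a union.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical user-defined vector type from an existing code base.
[StructLayout(LayoutKind.Sequential)]
public struct UserVector4
{
    public float X, Y, Z, W;
}

// Hypothetical stand-in for Mono.Simd.Vector4f (same four-float layout).
[StructLayout(LayoutKind.Sequential)]
public struct SimdLikeVector4
{
    public float X, Y, Z, W;
}

// "C# style union": both views start at offset 0 and share the same 16 bytes.
[StructLayout(LayoutKind.Explicit)]
public struct Vector4Union
{
    [FieldOffset(0)] public UserVector4 User;
    [FieldOffset(0)] public SimdLikeVector4 Simd;
}

public static class UnionDemo
{
    // Reinterprets the bits of a UserVector4 as a SimdLikeVector4, without unsafe code.
    public static SimdLikeVector4 AsSimd(UserVector4 v)
    {
        Vector4Union u = new Vector4Union();
        u.User = v;
        return u.Simd;
    }

    public static void Main()
    {
        UserVector4 v = new UserVector4 { X = 1f, Y = 2f, Z = 3f, W = 4f };
        SimdLikeVector4 s = AsSimd(v);
        Console.WriteLine("{0} {1} {2} {3}", s.X, s.Y, s.Z, s.W); // prints 1 2 3 4
    }
}
```

Unlike the ref/fixed version recommended above, this variant copies the value into the union and back out, so it trades the unsafe block for extra copies.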
Re: [Mono-dev] Mono.Simd - slower than the normal implementation
Hi Alan,

There are a couple of issues with your code; let me go through them:

- Until recently (last night), getters were not accelerated, which causes a significant slowdown. I fixed this in r118899. The generated code is not as good as it could be, but this will be fixed eventually.
- Setters are still not accelerated. I'll work on this next week, so until then your code will suffer.
- Once you use a single non-accelerated method on a Vector variable, all operations on it will be slower due to how our JIT works. They still use SSE instructions, but with a performance penalty.
- Getters and setters are a hint of ill-vectorized code. The last part of your unsafe code should use temps for the intermediate results.
- In the unsafe case you should use a Vector4ui store instead of extracting each element.
- For the safe case we still lack proper integration with arrays, in the form of methods to extract and store vectors from them.

Your code looks a bit strange, the Vector4ui constructor indexes in particular. Have you checked that the output of the 3 methods is the same?

I'll work on the Mono.Simd issues next week: getting setters accelerated, adding some methods to better integrate with arrays, and other things like element extractors.

Rodrigo
Re: [Mono-dev] Mono.Simd - slower than the normal implementation
Hey,

On Sat, Nov 15, 2008 at 3:50 PM, Rodrigo Kumpera wrote:
> Hi Alan,
> -Getters and setter are a hint of ill vectorized code.

In this particular scenario, I'm not sure how I can get rid of the getters/setters unless I use even more unsafe code. I don't know whether it's feasible or not, but it'd be great to be able to use this API without having to use unsafe code. At the moment, I don't think it's really possible to use this API without getters and setters.

> The last part of your unsafe code should use temps for the intermediate results.

Do you mean that I should copy the vector 'e', which I got from XOR'ing my values, into another Vector4ui using the store operation, and then do my bit shifting and storing into the uint[] from that one?

> -For the safe case we still miss proper integration with arrays, in the form of methods to extract and store vectors from them.

I was thinking that the API could expose something like:

Vector4ui.Create (uint[] array, int offset, ref Vector4ui result)

which could be changed into:

result = *((Vector4ui*)&array [offset]);

Though I'm sure you have ideas already on this ;) A similar method for storing the result into a uint[] would be great too.

> Your code looks a bit strange, the Vector4ui constructor indexes in particular. Have you checked that the output of the 3 methods are the same?

Yes, there is a bug in my implementation there: I left out a bracket when setting the value of buff[t+3]. There should be an additional bracket around (e.W << 1) | (e.W >> 31). Other than that, the implementation is correct. I've pasted the correct implementations of the unsafe and safe SIMD versions below, just for reference purposes.

> I'll work on the Mono.Simd issues next week, getting setters to be accelerated, some methods to better integrate with arrays and other things like element extractors.

Great stuff. Give me a shout when you've done that and I'll try to improve the above implementation. Though if you have time to spare while writing the SIMD code, you could take a look at it yourself ;)

Thanks,
Alan.

Reference implementations (non buggy ;) ):

public static void FillBuffSafe(uint[] buff)
{
    for (int t = 16; t < buff.Length; t += 4)
    {
        Vector4ui e = new Vector4ui(buff[t - 3], buff[t - 2], buff[t - 1], buff[t - 0])
                    ^ new Vector4ui(buff[t - 8], buff[t - 7], buff[t - 6], buff[t - 5])
                    ^ new Vector4ui(buff[t - 14], buff[t - 13], buff[t - 12], buff[t - 11])
                    ^ new Vector4ui(buff[t - 16], buff[t - 15], buff[t - 14], buff[t - 13]);
        e.W ^= buff[t];
        buff[t] = (e.X << 1) | (e.X >> 31);
        buff[t + 1] = (e.Y << 1) | (e.Y >> 31);
        buff[t + 2] = (e.Z << 1) | (e.Z >> 31);
        buff[t + 3] = ((e.W << 1) | (e.W >> 31)) ^ ((e.X << 2) | (e.X >> 30));
    }
}

public unsafe static void FillBuffUnsafe(uint[] buffb)
{
    fixed (uint* buff = buffb)
    {
        for (int t = 16; t < buffb.Length; t += 4)
        {
            Vector4ui e = *((Vector4ui*)&buff[t - 3])
                        ^ *((Vector4ui*)&buff[t - 8])
                        ^ *((Vector4ui*)&buff[t - 14])
                        ^ *((Vector4ui*)&buff[t - 16]);
            e.W ^= buff[t];
            buff[t] = (e.X << 1) | (e.X >> 31);
            buff[t + 1] = (e.Y << 1) | (e.Y >> 31);
            buff[t + 2] = (e.Z << 1) | (e.Z >> 31);
            buff[t + 3] = ((e.W << 1) | (e.W >> 31)) ^ ((e.X << 2) | (e.X >> 30));
        }
    }
}
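The four-words-per-step grouping discussed in this thread can be checked without Mono.Simd. The sketch below is an assumption-laden illustration, not the thread's benchmark: it uses plain uint locals (x, y, z, q) in place of the four lanes of a Vector4ui, and the names Sha1ScheduleDemo, FillScalar, FillGrouped and Matches are made up for the example. It shows why the last lane needs a correction: buff[t] has not been computed yet when the group is loaded, so the stale value is XOR'ed out again and the missing term is added back as a rotate-by-2 of the first lane.

```csharp
using System;

public static class Sha1ScheduleDemo
{
    // Rotate left by n bits (0 < n < 32).
    static uint Rol(uint v, int n)
    {
        return (v << n) | (v >> (32 - n));
    }

    // Reference: expand words 16..79 one at a time.
    public static void FillScalar(uint[] w)
    {
        for (int t = 16; t < w.Length; t++)
            w[t] = Rol(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);
    }

    // Four words per iteration; x, y, z, q play the role of the four vector lanes.
    public static void FillGrouped(uint[] w)
    {
        for (int t = 16; t < w.Length; t += 4)
        {
            uint stale = w[t]; // not computed yet; a vector load starting at t-3 would read it
            uint x = w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16];
            uint y = w[t - 2] ^ w[t - 7] ^ w[t - 13] ^ w[t - 15];
            uint z = w[t - 1] ^ w[t - 6] ^ w[t - 12] ^ w[t - 14];
            uint q = stale    ^ w[t - 5] ^ w[t - 11] ^ w[t - 13];
            q ^= stale;        // cancel the stale value (the e.W ^= buff[t] step)

            w[t]     = Rol(x, 1);
            w[t + 1] = Rol(y, 1);
            w[t + 2] = Rol(z, 1);
            // The real w[t] is Rol(x, 1), and rotation distributes over XOR,
            // so Rol(q ^ w[t], 1) == Rol(q, 1) ^ Rol(x, 2).
            // Written with raw shifts, both rotates need their own parentheses,
            // because ^ binds tighter than | in C#.
            w[t + 3] = Rol(q, 1) ^ Rol(x, 2);
        }
    }

    // Runs both versions on the same arbitrary input and compares all 80 words.
    public static bool Matches()
    {
        uint[] a = new uint[80];
        uint[] b = new uint[80];
        for (int i = 0; i < 16; i++)
            a[i] = b[i] = (uint)i * 2654435761u + 1u; // arbitrary test pattern
        FillScalar(a);
        FillGrouped(b);
        for (int i = 0; i < 80; i++)
            if (a[i] != b[i]) return false;
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(Matches()); // prints True
    }
}
```

A check like Matches() is also a cheap way to answer the "have you checked that the outputs are the same?" question before timing anything.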
Re: [Mono-dev] Mono.Simd - slower than the normal implementation
Here's my benchmarking file anyway, it may prove useful.

Alan.
[Mono-dev] Mono.Simd - slower than the normal implementation
I found a bit of code in the SHA1 implementation which I thought was ideal for SIMD optimisations. However, unless I resort to unsafe code, it's actually substantially slower! I've attached three implementations of the method here: the original, the safe SIMD and the unsafe SIMD. The runtimes are as follows:

Original:    600ms
Unsafe Simd: 450ms
Safe Simd:   1700ms

Also, the method is always called with a uint[] of length 80. Is this just the wrong place to be using SIMD? It seemed ideal because I need 75% fewer XORs. If anyone has any ideas on whether SIMD could actually be useful for this case or not, let me know.

Thanks,
Alan.

The original code is:

private static void FillBuff(uint[] buff)
{
    uint val;
    for (int i = 16; i < 80; i += 8)
    {
        val = buff[i - 3] ^ buff[i - 8] ^ buff[i - 14] ^ buff[i - 16];
        buff[i] = (val << 1) | (val >> 31);
        val = buff[i - 2] ^ buff[i - 7] ^ buff[i - 13] ^ buff[i - 15];
        buff[i + 1] = (val << 1) | (val >> 31);
        val = buff[i - 1] ^ buff[i - 6] ^ buff[i - 12] ^ buff[i - 14];
        buff[i + 2] = (val << 1) | (val >> 31);
        val = buff[i + 0] ^ buff[i - 5] ^ buff[i - 11] ^ buff[i - 13];
        buff[i + 3] = (val << 1) | (val >> 31);
        val = buff[i + 1] ^ buff[i - 4] ^ buff[i - 10] ^ buff[i - 12];
        buff[i + 4] = (val << 1) | (val >> 31);
        val = buff[i + 2] ^ buff[i - 3] ^ buff[i - 9] ^ buff[i - 11];
        buff[i + 5] = (val << 1) | (val >> 31);
        val = buff[i + 3] ^ buff[i - 2] ^ buff[i - 8] ^ buff[i - 10];
        buff[i + 6] = (val << 1) | (val >> 31);
        val = buff[i + 4] ^ buff[i - 1] ^ buff[i - 7] ^ buff[i - 9];
        buff[i + 7] = (val << 1) | (val >> 31);
    }
}

The unsafe SIMD code is:

public unsafe static void FillBuff(uint[] buffb)
{
    fixed (uint* buff = buffb)
    {
        Vector4ui e;
        for (int t = 16; t < buffb.Length; t += 4)
        {
            e = *((Vector4ui*)&(buff [t-16]))
              ^ *((Vector4ui*)&(buff [t-14]))
              ^ *((Vector4ui*)&(buff [t- 8]))
              ^ *((Vector4ui*)&(buff [t- 3]));
            e.W ^= buff[t];
            buff[t] = (e.X << 1) | (e.X >> 31);
            buff[t + 1] = (e.Y << 1) | (e.Y >> 31);
            buff[t + 2] = (e.Z << 1) | (e.Z >> 31);
            buff[t + 3] = (e.W << 1) | (e.W >> 31) ^ ((e.X << 2) | (e.X >> 30));
        }
    }
}

The safe SIMD code is:

public static void FillBuff(uint[] buff)
{
    Vector4ui e;
    for (int t = 16; t < buff.Length; t += 4)
    {
        e = new Vector4ui (buff [t-16],buff [t-15],buff [t-14],buff [t-13])
          ^ new Vector4ui (buff [t-14],buff [t-13],buff [t-12],buff [t-11])
          ^ new Vector4ui (buff [t-8], buff [t-7], buff [t-6], buff [t-5])
          ^ new Vector4ui (buff [t-3], buff [t-2], buff [t-1], buff [t-0]);
        e.W ^= buff[t];
        buff[t] = (e.X << 1) | (e.X >> 31);
        buff[t + 1] = (e.Y << 1) | (e.Y >> 31);
        buff[t + 2] = (e.Z << 1) | (e.Z >> 31);
        buff[t + 3] = (e.W << 1) | (e.W >> 31) ^ ((e.X << 2) | (e.X >> 30));
    }
}
Re: [Mono-dev] Mono.Simd - slower than the normal implementation
I forgot to mention that I'm on a 1.86GHz Core 2 Duo and I was running with --optimize=simd.

Alan.

On Sat, Nov 15, 2008 at 2:13 AM, Alan McGovern [EMAIL PROTECTED] wrote:

> I found a bit of code in the SHA1 implementation which I thought was ideal for SIMD optimisations. However, unless I resort to unsafe code, it's actually substantially slower! I've attached three implementations of the method here: the original, the safe SIMD and the unsafe SIMD. The runtimes are as follows:
>
>     Original:    600ms
>     Unsafe Simd: 450ms
>     Safe Simd:   1700ms
>
> Also, the method is always called with a uint[] of length 80. Is this just the wrong place to be using SIMD? It seemed ideal because I need 75% fewer XORs. If anyone has any ideas on whether SIMD could actually be useful for this case or not, let me know.
>
> Thanks,
> Alan.
>
> The original code is:
>
>     private static void FillBuff(uint[] buff)
>     {
>         uint val;
>         for (int i = 16; i < 80; i += 8) {
>             val = buff[i - 3] ^ buff[i - 8] ^ buff[i - 14] ^ buff[i - 16];
>             buff[i] = (val << 1) | (val >> 31);
>             val = buff[i - 2] ^ buff[i - 7] ^ buff[i - 13] ^ buff[i - 15];
>             buff[i + 1] = (val << 1) | (val >> 31);
>             val = buff[i - 1] ^ buff[i - 6] ^ buff[i - 12] ^ buff[i - 14];
>             buff[i + 2] = (val << 1) | (val >> 31);
>             val = buff[i + 0] ^ buff[i - 5] ^ buff[i - 11] ^ buff[i - 13];
>             buff[i + 3] = (val << 1) | (val >> 31);
>             val = buff[i + 1] ^ buff[i - 4] ^ buff[i - 10] ^ buff[i - 12];
>             buff[i + 4] = (val << 1) | (val >> 31);
>             val = buff[i + 2] ^ buff[i - 3] ^ buff[i - 9] ^ buff[i - 11];
>             buff[i + 5] = (val << 1) | (val >> 31);
>             val = buff[i + 3] ^ buff[i - 2] ^ buff[i - 8] ^ buff[i - 10];
>             buff[i + 6] = (val << 1) | (val >> 31);
>             val = buff[i + 4] ^ buff[i - 1] ^ buff[i - 7] ^ buff[i - 9];
>             buff[i + 7] = (val << 1) | (val >> 31);
>         }
>     }
>
> The unsafe SIMD code is:
>
>     public unsafe static void FillBuff(uint[] buffb)
>     {
>         fixed (uint* buff = buffb) {
>             Vector4ui e;
>             for (int t = 16; t < buffb.Length; t += 4) {
>                 e = *((Vector4ui*)&(buff [t-16])) ^
>                     *((Vector4ui*)&(buff [t-14])) ^
>                     *((Vector4ui*)&(buff [t- 8])) ^
>                     *((Vector4ui*)&(buff [t- 3]));
>                 e.W ^= buff[t];
>                 buff[t]     = (e.X << 1) | (e.X >> 31);
>                 buff[t + 1] = (e.Y << 1) | (e.Y >> 31);
>                 buff[t + 2] = (e.Z << 1) | (e.Z >> 31);
>                 buff[t + 3] = (e.W << 1) | (e.W >> 31) ^ ((e.X << 2) | (e.X >> 30));
>             }
>         }
>     }
>
> The safe SIMD code is:
>
>     public static void FillBuff(uint[] buff)
>     {
>         Vector4ui e;
>         for (int t = 16; t < buff.Length; t += 4) {
>             e = new Vector4ui (buff [t-16], buff [t-15], buff [t-14], buff [t-13]) ^
>                 new Vector4ui (buff [t-14], buff [t-13], buff [t-12], buff [t-11]) ^
>                 new Vector4ui (buff [t-8],  buff [t-7],  buff [t-6],  buff [t-5]) ^
>                 new Vector4ui (buff [t-3],  buff [t-2],  buff [t-1],  buff [t-0]);
>             e.W ^= buff[t];
>             buff[t]     = (e.X << 1) | (e.X >> 31);
>             buff[t + 1] = (e.Y << 1) | (e.Y >> 31);
>             buff[t + 2] = (e.Z << 1) | (e.Z >> 31);
>             buff[t + 3] = (e.W << 1) | (e.W >> 31) ^ ((e.X << 2) | (e.X >> 30));
>         }
>     }
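Editorial sketch, not part of the original message: both SIMD versions above compute four schedule words per iteration, but the fourth word depends on buff[t], which is only produced in the same iteration. The posted code handles this by cancelling buff[t] out of lane W and XORing in a rotate-by-2 of lane X afterwards, which works because a rotation distributes over XOR. The plain C# below (no Mono.Simd; the class and method names are made up for illustration) mirrors the four lanes with ordinary uints and compares the result against the straightforward scalar schedule. It also evaluates the last-lane expression both with explicit parentheses and exactly as written above, since in C# `^` binds tighter than `|`, so the expression as posted may not group the way it reads.

```csharp
using System;

static class Sha1ScheduleCheck
{
    static uint RotL(uint v, int n) { return (v << n) | (v >> (32 - n)); }

    // Reference: one schedule word per step.
    static void FillScalar(uint[] w)
    {
        for (int t = 16; t < w.Length; t++)
            w[t] = RotL(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);
    }

    // Four words per step; x, y, z, p stand in for lanes X, Y, Z, W.
    static void FillFourAtATime(uint[] w, bool parenthesised)
    {
        for (int t = 16; t < w.Length; t += 4) {
            uint x = w[t - 16] ^ w[t - 14] ^ w[t - 8] ^ w[t - 3];
            uint y = w[t - 15] ^ w[t - 13] ^ w[t - 7] ^ w[t - 2];
            uint z = w[t - 14] ^ w[t - 12] ^ w[t - 6] ^ w[t - 1];
            // Lane W without w[t], which is not known yet at this point.
            uint p = w[t - 13] ^ w[t - 11] ^ w[t - 5];
            w[t]     = RotL(x, 1);
            w[t + 1] = RotL(y, 1);
            w[t + 2] = RotL(z, 1);
            // w[t+3] = rotl1(p ^ w[t]) = rotl1(p) ^ rotl2(x)
            w[t + 3] = parenthesised
                ? ((p << 1) | (p >> 31)) ^ ((x << 2) | (x >> 30))
                : (p << 1) | (p >> 31) ^ ((x << 2) | (x >> 30)); // grouping as posted
        }
    }

    static bool Same(uint[] a, uint[] b)
    {
        for (int i = 0; i < a.Length; i++)
            if (a[i] != b[i]) return false;
        return true;
    }

    static void Main()
    {
        // Arbitrary test input: 16 words from a simple linear congruential generator.
        var seed = new uint[80];
        uint state = 12345u;
        for (int i = 0; i < 16; i++) {
            state = state * 1664525u + 1013904223u;
            seed[i] = state;
        }

        var reference = (uint[])seed.Clone();
        var grouped   = (uint[])seed.Clone();
        var asPosted  = (uint[])seed.Clone();
        FillScalar(reference);
        FillFourAtATime(grouped, true);
        FillFourAtATime(asPosted, false);

        Console.WriteLine("parenthesised form matches scalar: " + Same(reference, grouped));
        Console.WriteLine("form as posted matches scalar:     " + Same(reference, asPosted));
    }
}
```

If the second line reports a mismatch, wrapping the first rotation in its own parentheses before the `^` would be the fix to try in the SIMD versions; this says nothing about the timing question, only about the result.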
Re: [Mono-dev] Mono.Simd Acceleration Attributes
Rodrigo,

My only problem with this is that the language is tied to the x86 architecture. When AltiVec, Paired Single, etc. are added for PowerPC, these attributes become nonsensical and will mean nothing to the user. This would be better done in a static location (rather than spread over the libraries) and split into a machine-agnostic part (SIMD acceleration ON) and a machine-specific part (SSE1 - 4.2 active).

My 2c

Russell

> From: [EMAIL PROTECTED] On Behalf Of Rodrigo Kumpera
> Sent: 07 November 2008 15:15
> To: Christophe Guillon
> Cc: mono-devel-list@lists.ximian.com
> Subject: Re: [Mono-dev] Mono.Simd Acceleration Attributes
>
> [...]
Re: [Mono-dev] Mono.Simd Acceleration Attributes
Hi Russell,

Our initial goal is to make SIMD instructions available to managed code. At first we thought about trying to make an instruction-set-agnostic library, but there are way too many quirks and differences between instruction sets, and the result could be too crippled to be usable. There are quite a few valid use cases for having the whole SSE instruction set available, and these are what we are targeting now.

But then, this analysis was based on the fact that no well-known compiler/runtime exposes such a library (arch-agnostic SIMD); they always bind to a specific platform. This doesn't mean we just won't do it. Once we have, for example, AltiVec and VFP supported, if a usable common subset emerges we'll work on making it available.

Now back to the Acceleration attribute. It's meant to support not only SSE but other instruction sets as well; they are not present for the simple reason that we haven't had the time for it. Anyway, the attribute should for now be considered an implementation detail, and if it proves to be a problem in cases such as the one you describe, we'll change it.

Keep in mind that the current design is not final, but at the same time it's hard to change it based on assumptions.

Thanks for the feedback,
Rodrigo

On Mon, Nov 10, 2008 at 10:04 AM, [EMAIL PROTECTED] wrote:

> Rodrigo,
>
> My only problem with this is that the language is tied to the x86 architecture. When AltiVec, Paired Single, etc. are added for PowerPC, these attributes become nonsensical and will mean nothing to the user. This would be better done in a static location (rather than spread over the libraries) and split into a machine-agnostic part (SIMD acceleration ON) and a machine-specific part (SSE1 - 4.2 active).
>
> My 2c
>
> Russell
Re: [Mono-dev] Mono.Simd suggestion: Add static members for common values
Hi John,

Default values are indeed a useful addition. So far we have focused on API completeness and not much on making it easier to use. Adding such helpers is in our plans.

2008/11/6 Hurliman, John [EMAIL PROTECTED]

> I'm in the process of converting over my OpenMetaverseTypes.dll library (basic 3D type library) to use Mono.Simd. One thing that is very handy to have is static members for common values, such as:
>
>     public static readonly Vector4f Zero = new Vector4f();
>     public static readonly Vector4f One = new Vector4f(1f, 1f, 1f, 1f);
>     public static readonly Vector4f MinusOne = new Vector4f(-1f, -1f, -1f, -1f);
>
> Which makes comparisons much easier:
>
>     if (myvector4f == Vector4f.Zero) { ... }
>
> -John
Re: [Mono-dev] Mono.Simd Acceleration Attributes
On 11/07/08 Christophe Guillon wrote:
> It seems that as soon as the Mono.Simd primitives have a well defined semantic it is not useful to specify which architecture feature is able to emulate each of these primitives. I would have expected this to be the choice of the virtual execution environment.

It _is_ ultimately a choice of the runtime. These attributes are never inspected by the runtime to decide whether to optimize a method call or not.

> - if my underlying hardware XXX (not SSE2) is able to support efficiently add with saturation, I do not have to know whether SSE2 also supports it, the virtual machine for XXX can use the corresponding add with saturation instruction of XXX at the call sites of AddWithSaturation() anyway,

When the runtime implements that optimization, the attribute will be changed to include SSE2 and your architecture (say AltiVec or Neon etc). Yes, this requires a re-release of Mono.Simd, but it's not a big deal as the changes will be relatively rare, and if you are happy to use unoptimized Mono.Simd anyway it doesn't matter.

> - if my underlying hardware features SSE2, the attribute is not useful, the virtual machine knows the underlying hardware and thus knows that a SSE2 instruction is able to emulate this,

It's useful to the Mono.Simd programmers; the runtime doesn't use it.

> - if the attribute is there to restrict the mapping to only SSE2 (and above) machines, it is an important restriction to the usage of the library. Imagine as above that I have in the future a hardware support XXX that is able to do AddWithSaturation on Vector16b; if I want a virtual machine to execute efficiently this primitive on XXX I would first have to modify the Mono.Simd library to add the corresponding XXX attribute and modify the primitives declaration to account for it.

Nope, this is not correct. The behaviour is as follows:

1) the runtime will choose whether a method is optimized or not depending on the optimization flags (-O=simd, on by default) and on the features of the current processor.
2) the attributes on the methods are never inspected by the runtime: they are there to guide the programmers using Mono.Simd in determining what kind of optimizations are usually available or currently enabled.

The reasoning is this: using unoptimized Mono.Simd is currently significantly slower than the equivalent scalar code. This has mostly to do with the additional copies that happen because of the operator overloading. This overhead is expected to decrease as we add more JIT optimizations. So you have two cases:

1) the slowdown is not significant to you (you must test! Run your program with mono -O=simd and with mono -O=-simd): in this case you should completely ignore the acceleration attributes and just enjoy the speedup that the JIT will give you when it can optimize the methods.
2) if the slowdown is significant you might want to have two codepaths, in much the same way as in C/C++ you have a C implementation and a SIMD implementation of the critical functions.

Now the question becomes: how do you choose at runtime whether to use Mono.Simd or the scalar codepath? We offer two patterns:

a) do a coarse decision: you take a look at the methods you use in your algorithms and see that they are optimized when SSE2 is enabled, so you just do:

    static readonly bool use_mono_simd = (SimdRuntime.AccelMode & AccelMode.SSE2) != 0;
    ...
    if (use_mono_simd)
        // simd codepath
    else
        // scalar codepath

b) a fine-grained decision based on all or some of the methods you use: for each method you check

    (SimdRuntime.MethodAccelerationMode (typeof(...), ...) & SimdRuntime.AccelMode) != 0

until you determine that enough of your methods are accelerated to make it worth using the Mono.Simd codepath.

Note that we may eventually return the attribute based not on the metadata in the assembly but on the runtime's understanding: this will avoid the need to have an updated Mono.Simd assembly when new optimizations are added. Just use the b pattern if you want to avoid that issue, and remember that you don't usually need to check all the methods, just the ones you actually need to be optimized.

lupus

--
[EMAIL PROTECTED] debian/rules
[EMAIL PROTECTED] Monkeys do it better
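Editorial sketch, not part of the original message: pattern (a) above might look roughly like the following when wrapped around a small kernel. The SimdRuntime.AccelMode check, AccelMode.SSE2 and the four-float Vector4f constructor are taken from this thread; the Vector4f `+` operator and the X/Y/Z/W members are assumed to behave as they do for the vector types shown elsewhere in the thread. The class name VectorAdd and the method Add are hypothetical, and the code needs a reference to Mono's Mono.Simd assembly.

```csharp
using System;
using Mono.Simd; // assumption: compiled against Mono.Simd.dll from a Mono install

static class VectorAdd
{
    // Pattern (a): one coarse check, evaluated once when the type is initialised.
    static readonly bool use_mono_simd =
        (SimdRuntime.AccelMode & AccelMode.SSE2) != 0;

    // Hypothetical kernel: dst[i] = a[i] + b[i].
    // Assumes all three arrays have the same length and it is a multiple of 4.
    public static void Add(float[] a, float[] b, float[] dst)
    {
        if (use_mono_simd) {
            // simd codepath: four lanes per iteration
            for (int i = 0; i < dst.Length; i += 4) {
                Vector4f r = new Vector4f(a[i], a[i + 1], a[i + 2], a[i + 3])
                           + new Vector4f(b[i], b[i + 1], b[i + 2], b[i + 3]);
                dst[i]     = r.X;
                dst[i + 1] = r.Y;
                dst[i + 2] = r.Z;
                dst[i + 3] = r.W;
            }
        } else {
            // scalar codepath
            for (int i = 0; i < dst.Length; i++)
                dst[i] = a[i] + b[i];
        }
    }

    static void Main()
    {
        var a = new float[] { 1, 2, 3, 4 };
        var b = new float[] { 10, 20, 30, 40 };
        var dst = new float[4];
        Add(a, b, dst);
        Console.WriteLine(string.Join(", ", Array.ConvertAll(dst, v => v.ToString())));
    }
}
```

The sketch only illustrates where the check sits. Building each Vector4f element by element is the kind of extra copying discussed in the SHA1 thread, so a real kernel would want to measure both codepaths, as the message says.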
Re: [Mono-dev] Mono.Simd Acceleration Attributes
Thank you for the explanation. It confirms my point and it seems that we agree.

For the user guide aspect:

> 2) the attributes on the methods are never inspected by the runtime: they are there to guide the programmers using Mono.Simd in determining what kind of optimizations are usually available or currently enabled.

If it is indeed just a guide for the user of Mono.Simd, then why put it in the library and couple it with a specific architecture (SSE2 or other)? The fact that it is an AddWithSaturation on a Vector16b is sufficient for the semantics. A note in the mono VM documentation saying that on SSE2 architectures -O=simd will select the corresponding SSE2 op is then sufficient. Optionally, a note in the library documentation can say that mono should normally catch such calls on SSE2 architectures.

For the choice of the accelerated or not accelerated mode at runtime:

>     static readonly bool use_mono_simd = (SimdRuntime.AccelMode & AccelMode.SSE2) != 0;
>     ...
>     if (use_mono_simd)
>         // simd codepath
>     else
>         // scalar codepath

If it is actually to overcome a temporary inefficiency due to some copy, it is imho far too intrusive in the user code. Here the user clearly wrote code that depends on some external context, but instead of querying the actual VM runtime, or simply a user-defined variable that can be found in some configuration file of the application, the query is on the Mono.Simd library itself, while in fact the library itself has no knowledge of the actual efficiency of the running VM.

Thus I fully agree with this (which is my point):

> Note that we may eventually either return the attribute not based on the metadata in the assembly, but based on the runtime understanding: this will avoid the need to have an updated Mono.Simd assembly when new optimizations are added. Just use the b pattern if you want to avoid that issue and remember that you don't usually need to check all the methods, but just the ones you actually need to be optimized.

The whole question is whether or not there is a way to get this information from the runtime, and by which means. Is it possible to have attributes attached (or simulated) by the runtime?

-- Christophe

2008/11/7 Paolo Molaro [EMAIL PROTECTED]

> [...]
[Mono-dev] Mono.Simd Acceleration Attributes
Hi all,

Looking at the proposal for the Mono.Simd primitives, I'm wondering how the Mono.Simd.Acceleration attributes and the corresponding Mono.Simd.AccelMode parameters are useful. Thus I'm wondering what the rationale is for having these attributes defined and used in the definition of the primitives.

It seems that as soon as the Mono.Simd primitives have a well defined semantic it is not useful to specify which architecture feature is able to emulate each of these primitives. I would have expected this to be the choice of the virtual execution environment.

For instance the add with saturation for the Vector16b type, which is defined as:

    [Mono.Simd.Acceleration(Mono.Simd.AccelMode.SSE2)]
    public static Vector16b AddWithSaturation (Vector16b va, Vector16b vb)

Well, but:

- if my underlying hardware XXX (not SSE2) is able to support efficiently add with saturation, I do not have to know whether SSE2 also supports it, the virtual machine for XXX can use the corresponding add with saturation instruction of XXX at the call sites of AddWithSaturation() anyway,
- if my underlying hardware features SSE2, the attribute is not useful, the virtual machine knows the underlying hardware and thus knows that a SSE2 instruction is able to emulate this,
- if the attribute is there to restrict the mapping to only SSE2 (and above) machines, it is an important restriction to the usage of the library. Imagine as above that I have in the future a hardware support XXX that is able to do AddWithSaturation on Vector16b; if I want a virtual machine to execute efficiently this primitive on XXX I would first have to modify the Mono.Simd library to add the corresponding XXX attribute and modify the primitives declaration to account for it.

-- Christophe
Re: [Mono-dev] Mono.Simd Acceleration Attributes
Hi Christophe,

2008/11/7 Christophe Guillon [EMAIL PROTECTED]

> Thank you for the explanation. It confirms my point and it seems that we agree.
>
> For the user guide aspect:
>
> > 2) the attributes on the methods are never inspected by the runtime: they are there to guide the programmers using Mono.Simd in determining what kind of optimizations are usually available or currently enabled.
>
> If it is indeed just a guide for the user of Mono.Simd, then why put it in the library and couple it with a specific architecture (SSE2 or other)? The fact that it is an AddWithSaturation on a Vector16b is sufficient for the semantics. A note in the mono VM documentation saying that on SSE2 architectures -O=simd will select the corresponding SSE2 op is then sufficient. Optionally, a note in the library documentation can say that mono should normally catch such calls on SSE2 architectures.

We want to expose this information in the documentation as well, and instead of having to dig this information up twice we are planning on generating this part of the docs.

> For the choice of the accelerated or not accelerated mode at runtime:
>
> >     static readonly bool use_mono_simd = (SimdRuntime.AccelMode & AccelMode.SSE2) != 0;
> >     ...
> >     if (use_mono_simd)
> >         // simd codepath
> >     else
> >         // scalar codepath
>
> If it is actually to overcome a temporary inefficiency due to some copy, it is imho far too intrusive in the user code. Here the user clearly wrote code that depends on some external context, but instead of querying the actual VM runtime, or simply a user-defined variable that can be found in some configuration file of the application, the query is on the Mono.Simd library itself, while in fact the library itself has no knowledge of the actual efficiency of the running VM.

There are two good reasons for using this approach. The first is that the user requires the best performance in all situations and wants to know whether their method will be optimized or not.

The second arises when there are many different ways to implement a given function, each one using different instruction sets, and the user wants to have improved performance on newer processors. For example, there are 3 ways to implement dot product using Mono.Simd:

1) Only using sse1 and sse2, which takes 5 instructions (1 mul, 2 add and 2 shuffle)
2) Using sse3, which takes 3 instructions (1 mul, 2 hadd)
3) Using sse4.2, which takes 1 instruction (dotp) -- sse4.2 is still not supported by Mono.Simd.

For some users having this option is important, and this is the main objective of the runtime query capabilities.

> Thus I fully agree with this (which is my point):
>
> > Note that we may eventually either return the attribute not based on the metadata in the assembly, but based on the runtime understanding: this will avoid the need to have an updated Mono.Simd assembly when new optimizations are added. Just use the b pattern if you want to avoid that issue and remember that you don't usually need to check all the methods, but just the ones you actually need to be optimized.
>
> The whole question is whether or not there is a way to get this information from the runtime, and by which means. Is it possible to have attributes attached (or simulated) by the runtime?

The SimdRuntime.AccelMode property queries the runtime for the supported instruction sets. You might look at the implementation and be puzzled by the fact that it returns AccelMode.None, but in fact this is a magic method that the runtime treats specially to make sure it returns the right thing.

Thanks for taking the time to look at the Mono.Simd library :)

Cheers,
Rodrigo
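Editorial sketch, not part of the original message: the "1 mul, 2 hadd" variant in the list above can be followed lane by lane in plain C#. The code below does not use Mono.Simd at all; it models a four-float vector as a float[4], and the HAdd helper is written to follow the usual description of the SSE3 horizontal add (adjacent pairs of the first operand, then adjacent pairs of the second). The names DotSketch, Mul, HAdd and Dot are made up for illustration.

```csharp
using System;

static class DotSketch
{
    // Lane-wise multiply: what a single SIMD "mul" computes.
    static float[] Mul(float[] a, float[] b)
    {
        return new[] { a[0] * b[0], a[1] * b[1], a[2] * b[2], a[3] * b[3] };
    }

    // Horizontal add, modelled on the SSE3 "hadd" lane layout:
    // (a0+a1, a2+a3, b0+b1, b2+b3).
    static float[] HAdd(float[] a, float[] b)
    {
        return new[] { a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3] };
    }

    // 1 mul + 2 hadd: after the second hadd every lane holds the full sum.
    static float Dot(float[] a, float[] b)
    {
        float[] m = Mul(a, b);   // (m0, m1, m2, m3)
        float[] h = HAdd(m, m);  // (m0+m1, m2+m3, m0+m1, m2+m3)
        return HAdd(h, h)[0];    // m0+m1+m2+m3
    }

    static void Main()
    {
        var a = new float[] { 1, 2, 3, 4 };
        var b = new float[] { 5, 6, 7, 8 };
        Console.WriteLine(Dot(a, b)); // 1*5 + 2*6 + 3*7 + 4*8 = 70
    }
}
```

The five-instruction variant reaches the same sum by replacing each horizontal add with a shuffle followed by an ordinary add, which is why a runtime check on the available instruction set can pay off for this kind of kernel.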
[Mono-dev] Mono.Simd API Suggestions
Hey Jonathan,

Thanks for taking some time to look at the Mono.Simd API and make some suggestions but, please, make them on a more visible mailing list such as mono-devel.

> Just perusing through the Mono.Simd API, and one question (and a few other suggestions) occurs to me: Why the non-reliance on method overloading?

Right now mostly due to implementation details, so no reason at all. I'm still playing with the option of using extension methods. It would have the same benefit as overloading, would reduce typing and would not make people mad when changing the underlying type. It's a matter of choosing between a.UnpackLow (b), Vector2l.UnpackLow (a,b) and VectorOperations.UnpackLow (a,b). Feedback on this subject is more than welcome.

Only a small part of the operations are available for all types, and some are under different instruction sets. This should be enough to make it pretty confusing for the user.

> On a completely different note (and to start a bikeshed discussion ;-), why ShiftRightLogic? Wouldn't LogicalRightShift be more conventional? We should also avoid abbreviations, so SubtractWithSaturation() would be better than SubWithSaturation()...

There is a very reasonable and compelling argument for that. I'm very bad at naming methods and only now are people starting to look more deeply at it. The shift one is a pretty bad choice indeed. OTOH, in some cases this might lead to some very, very long method names.

Thanks,
Rodrigo
Re: [Mono-dev] Mono.Simd API Suggestions
Hi Jonathan,

Answering your other suggestions.

> Other suggestions: SimdRuntime.IsMethodAccelerated() and SimdRuntime.MethodAccelerationMode() should be overloaded to accept a MethodInfo of the desired method, as it can be ~trivial to get a MethodInfo in a static, type-checked fashion, e.g.:
>
>     MethodInfo average = ((Func<Vector8us, Vector8us, Vector8us>) Vector8us.Average).Method;
>     bool b = SimdRuntime.IsMethodAccelerated (average);

Your idea is quite interesting. The MethodInfo overload is kind of useful, but letting the user just pass a delegate of the proper type is way nicer.

> Even better, this is _faster_ than typeof(Vector8us).GetMethod ("Average"). (Not by a lot, but faster nonetheless.)

Well, speed is irrelevant in this case, as this is a startup thing and it doesn't make sense to use it during execution.

> AccelerationAttribute should be AcceleratedOnAttribute, and AccelMode should be InstructionSet. I think this would make for more readable documentation prototypes:

Makes sense.

> Finally (for now), parameter names should be more consistent. On some methods the arguments are (va, vb) (e.g. Vector8us.Average()), while on others they're (v1, v2) (e.g. Vector2d.InterleaveHigh()). I don't care which we choose, but we should stick to something and use it consistently (`a` and `b`, perhaps?).

There is little consistency in this, and sticking to a single naming scheme will be relevant once C# 4.0 is released.
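Editorial sketch, not part of the original message: the delegate trick quoted above works for any static method, so it can be tried without Mono.Simd. The example below uses Math.Max purely as a stand-in for Vector8us.Average and compares the type-checked lookup with the string-based one; the class name MethodInfoSketch is made up for illustration.

```csharp
using System;
using System.Reflection;

static class MethodInfoSketch
{
    static void Main()
    {
        // Type-checked: the compiler picks the Math.Max(int, int) overload
        // from the delegate signature, and a typo in the name fails to compile.
        MethodInfo viaDelegate = ((Func<int, int, int>) Math.Max).Method;

        // String-based: the name and parameter types are only checked at run time.
        MethodInfo viaReflection =
            typeof(Math).GetMethod("Max", new[] { typeof(int), typeof(int) });

        Console.WriteLine(viaDelegate);
        Console.WriteLine(viaReflection);
    }
}
```

Both lines should describe the same overload. The practical difference is when a mistake shows up: at compile time for the delegate form, at run time (as a null result or an ambiguity error) for the string form.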
[Mono-dev] Mono.Simd suggestion: Add static members for common values
I'm in the process of converting over my OpenMetaverseTypes.dll library (basic 3D type library) to use Mono.Simd. One thing that is very handy to have is static members for common values, such as:

    public static readonly Vector4f Zero = new Vector4f();
    public static readonly Vector4f One = new Vector4f(1f, 1f, 1f, 1f);
    public static readonly Vector4f MinusOne = new Vector4f(-1f, -1f, -1f, -1f);

Which makes comparisons much easier:

    if (myvector4f == Vector4f.Zero) { ... }

-John
Re: [Mono-dev] Mono.Simd API Suggestions
On Thu, 2008-11-06 at 12:04 -0200, Rodrigo Kumpera wrote:
> Thanks for taking some time to look at the Mono.Simd API and make some suggestions but, please, make them on a more visible mailing list such as mono-devel.

Because I'm an idiot who saw mono-d... and assumed it was mono-devel-list. My bad...

> > Just perusing through the Mono.Simd API, and one question (and a few other suggestions) occurs to me: Why the non-reliance on method overloading?
> ...
> It's a matter of choosing between a.UnpackLow (b), Vector2l.UnpackLow (a,b) and VectorOperations.UnpackLow (a,b). Feedback on this subject is more than welcome.

I'm not even sure what UnpackLow() does (and reading the source only helps a little...). Regardless, I don't like Vector2l.UnpackLow(a,b), and would prefer the first or last options.

That said, I'm not sure which is better between static method syntax and instance method syntax. The advantage of instance method syntax is that you can readily find the method through the IDE (code completion ftw!), and it should likely be preferred for that reason alone. On the other hand, an instance method _may_ imply that the instance variable will be modified by the method call, which is not the case for .UnpackLow(). (Then again, this implication is already bogus; see System.String instance methods...)

So for usability/findability, I'd suggest the instance method syntax (even if it's really done via extension methods).

> Only a small part of the operations are available for all types and some are under different instruction sets. This should be enough to make it pretty confusing for the user.

I'm not sure about that, but it is something to keep in mind. So the question remains, which is easier for the user to understand:

- Instance methods, documented in the relevant type. Pro: You know which operations are available specifically for a given type. Con: You can't easily see which other types support the same operation. This may not be relevant at all; I don't know.
- Static methods on e.g. a VectorOperations type. Pro: You can easily determine which operations are available across numerous types. Con: You can't tell from the type's documentation which operations are supported.
- Extension methods. Pro: Methods are referenced from the type documentation and, since extension methods can be in the same extension class, they can also be listed as overloads. This easily allows determining which operations are common across types AND which operations are supported on a specific type from that type's documentation. Con: Requires C# 3.0. (Is this really a con?)

From that breakdown, it looks like extension methods are best. :-)

> > On a completely different note (and to start a bikeshed discussion ;-), why ShiftRightLogic? Wouldn't LogicalRightShift be more conventional? We should also avoid abbreviations, so
> ...
> OTOH, on some cases this might lead to some very very long method names.

Even vim supports code completion [0], so I don't consider this to be a significant problem...

- Jon

[0] Ctrl+N/Ctrl+P will complete words already present within the current buffer and also use any words found in a `ctags` file, if available.
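Editorial sketch, not part of the original message: the extension-method option in the breakdown above can be tried with two toy types. Pair2l and Pair2d below are hypothetical stand-ins, not the Mono.Simd structs, and the UnpackLow body assumes the operation takes the low element of each operand, which the thread itself does not spell out. The point is only that one static class can hold every overload while both call syntaxes remain available.

```csharp
using System;

// Hypothetical stand-ins for two vector types; not the Mono.Simd structs.
struct Pair2l
{
    public long X, Y;
    public Pair2l(long x, long y) { X = x; Y = y; }
}

struct Pair2d
{
    public double X, Y;
    public Pair2d(double x, double y) { X = x; Y = y; }
}

// One static class holds all overloads, so the operation can be documented
// once and still shows up in code completion on each type.
static class PairOperations
{
    // Assumed semantics: combine the low (X) element of each operand.
    public static Pair2l UnpackLow(this Pair2l a, Pair2l b)
    {
        return new Pair2l(a.X, b.X);
    }

    public static Pair2d UnpackLow(this Pair2d a, Pair2d b)
    {
        return new Pair2d(a.X, b.X);
    }
}

static class Program
{
    static void Main()
    {
        var a = new Pair2l(1, 2);
        var b = new Pair2l(3, 4);

        Pair2l viaInstanceSyntax = a.UnpackLow(b);                // a.UnpackLow (b)
        Pair2l viaStaticSyntax = PairOperations.UnpackLow(a, b);  // VectorOperations-style call

        Console.WriteLine(viaInstanceSyntax.X + "," + viaInstanceSyntax.Y); // 1,3
        Console.WriteLine(viaStaticSyntax.X + "," + viaStaticSyntax.Y);     // 1,3
    }
}
```

Neither operand is modified by the call, which matches the observation above that instance syntax need not imply mutation.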