Hi Alan, There a couple of issues with your code, let me get on them:
-Until recently (last night), getters were not accelerated, which causes a significant slowdown. I fixed this in r118899. The generated code is not as good as it could be, but this will be fixed eventually. -Setters are still not accelerated, I'll work on this next week, so until then your code will suffer. -Once you use a single non-accelerated method on a Vector variable all operations on it will be slower due to how our JIT works - they still use sse instructions, but with a performance penalty. -Getters and setter are a hint of ill vectorized code. The last part of your unsafe code should use temps for the intermediate results. -In the unsafe case you should use a Vector4ui store instead of extracting each element. -For the safe case we still miss proper integration with arrays, in the form of methods to extract and store vectors from them. Your code looks a bit strange, the Vector4ui constructor indexes in particular. Have you checked that the output of the 3 methods are the same? I'll work on the Mono.Simd issues next week, getting setters to be accelerated, some methods to better integrate with arrays and other things like element extractors. Rodrigo On Sat, Nov 15, 2008 at 12:13 AM, Alan McGovern <[EMAIL PROTECTED]>wrote: > I found a bit of code in the SHA1 implementation which i thought was > ideal for SIMD optimisations. However, unless i resort to unsafe code, > it's actually substantially slower! I've attached three > implementations of the method here. The original, the safe SIMD and > the unsafe SIMD. The runtimes are as follows: > > Original: 600ms > Unsafe Simd: 450ms > Safe Simd: 1700ms > > Also, the method is always called with a uint[] of length 80. > > Is this just the wrong place to be using simd? It seemed ideal because > i need 75% less XOR's. If anyone has an ideas on whether SIMD could > actually be useful for this case or not, let me know. > > Thanks, > Alan. > > > The original code is: > > private static void FillBuff(uint[] buff) > { > uint val; > for (int i = 16; i < 80; i += 8) > { > val = buff[i - 3] ^ buff[i - 8] ^ buff[i - 14] ^ buff[i - > 16]; > buff[i] = (val << 1) | (val >> 31); > > val = buff[i - 2] ^ buff[i - 7] ^ buff[i - 13] ^ buff[i - > 15]; > buff[i + 1] = (val << 1) | (val >> 31); > > val = buff[i - 1] ^ buff[i - 6] ^ buff[i - 12] ^ buff[i - > 14]; > buff[i + 2] = (val << 1) | (val >> 31); > > val = buff[i + 0] ^ buff[i - 5] ^ buff[i - 11] ^ buff[i - > 13]; > buff[i + 3] = (val << 1) | (val >> 31); > > val = buff[i + 1] ^ buff[i - 4] ^ buff[i - 10] ^ buff[i - > 12]; > buff[i + 4] = (val << 1) | (val >> 31); > > val = buff[i + 2] ^ buff[i - 3] ^ buff[i - 9] ^ buff[i - > 11]; > buff[i + 5] = (val << 1) | (val >> 31); > > val = buff[i + 3] ^ buff[i - 2] ^ buff[i - 8] ^ buff[i - > 10]; > buff[i + 6] = (val << 1) | (val >> 31); > > val = buff[i + 4] ^ buff[i - 1] ^ buff[i - 7] ^ buff[i - 9]; > buff[i + 7] = (val << 1) | (val >> 31); > } > } > > The unsafe SIMD code is: > public unsafe static void FillBuff(uint[] buffb) > { > fixed (uint* buff = buffb) { > Vector4ui e; > for (int t = 16; t < buffb.Length; t += 4) > { > e = *((Vector4ui*)&(buff [t-16])) ^ > *((Vector4ui*)&(buff [t-14])) ^ > *((Vector4ui*)&(buff [t- 8])) ^ > *((Vector4ui*)&(buff [t- 3])); > e.W ^= buff[t]; > > buff[t] = (e.X << 1) | (e.X >> 31); > buff[t + 1] = (e.Y << 1) | (e.Y >> 31); > buff[t + 2] = (e.Z << 1) | (e.Z >> 31); > buff[t + 3] = (e.W << 1) | (e.W >> 31) ^ ((e.X << 2) | (e.X >> > 30)); > } > } > } > > The safe simd code is: > public static void FillBuff(uint[] buff) > { > Vector4ui e; > for (int t = 16; t < buff.Length; t += 4) > { > e = new Vector4ui (buff [t-16],buff [t-15],buff > [t-14],buff [t-13]) ^ > new Vector4ui (buff [t-14],buff [t-13],buff > [t-12],buff [t-11]) ^ > new Vector4ui (buff [t-8], buff [t-7], buff > [t-6], buff [t-5]) ^ > new Vector4ui (buff [t-3], buff [t-2], buff > [t-1], buff [t-0]); > > e.W ^= buff[t]; > buff[t] = (e.X << 1) | (e.X >> 31); > buff[t + 1] = (e.Y << 1) | (e.Y >> 31); > buff[t + 2] = (e.Z << 1) | (e.Z >> 31); > buff[t + 3] = (e.W << 1) | (e.W >> 31) ^ ((e.X << 2) | > (e.X >> 30)); > } > } > _______________________________________________ > Mono-devel-list mailing list > Mono-devel-list@lists.ximian.com > http://lists.ximian.com/mailman/listinfo/mono-devel-list >
_______________________________________________ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list