Here is a link to an entry in bugzilla with the code attached.  I could not 
send it to the list:

http://bugzilla.xamarin.com/show_bug.cgi?id=2098

On Nov 20, 2011, at 7:41 AM, Jonathan Shore wrote:

> Did the code I attached get filtered?  I'll post the tar.gz into bugzilla and 
> send the link.
> 
> Below are code snippets to calculate Ordinary Least Squares, a simpler 
> example.   I found this to be 4x slower than C++ / Java:
> 
> Here are the "safe" and "unsafe" versions of OLS, which I ran 10,000 times 
> on an array of size 50,000:
> 
> public class SafeOLS
> {
>       public static double OLS (double[] x, double[] y)
>       {
>               var eXY = 0.0;
>               var eXX = 0.0;
>               var eX = 0.0;
>               var eY = 0.0;
>                       
>               var len = x.Length;
>               
>               for (int i = 0 ; i < len ; i++)
>               {
>                       var vx = x[i];
>                       var vy = y[i];
>               
>                       eXY += vx*vy;
>                       eXX += vx*vx;
>                       eX += vx;
>                       eY += vy;
>               }
>               
>               var n = (double)len;
>               return (eXY - eX * eY / n) / (eXX - eX * eX / n);
>       }
> }
> 
> 
> public class UnSafeOLS
> {
>       unsafe public static double OLS (double[] x, double[] y)
>       {
>               var eXY = 0.0;
>               var eXX = 0.0;
>               var eX = 0.0;
>               var eY = 0.0;
>               
>               var len = x.Length;
>               
>               fixed (double* px = x)
>               fixed (double* py = y)
>               {
>                       double* vpx = px;
>                       double* vpy = py;
>                       
>                       for (int i = 0 ; i < len ; i++)
>                       {
>                               var vx = *vpx++;
>                               var vy = *vpy++;
>                       
>                               eXY += vx*vy;
>                               eXX += vx*vx;
>                               eX += vx;
>                               eY += vy;
>                       }
>               }
>                       
>               var n = (double)len;
>               return (eXY - eX * eY / n) / (eXX - eX * eX / n);
>       }
> }
> 
> 
> One can use the following as a driver, parameterized with 50000, 10000 or 
> something like that:
> 
> private static void TestUnSafeOLS (int dim, int iterations)
> {
>       double[] x = new double[dim];
>       double[] y = new double[dim];
> 
>       for (int i = 0 ; i < x.Length ; i++)
>       {
>               x[i] = i;
>               y[i] = i*i / 1000.0;
>       }
> 
>       Stopwatch watch = new Stopwatch ();
>       watch.Start();
>                       
>       double sum = 0;
>       for (int i = 0 ; i < iterations ; i++)
>       {
>               sum += UnSafeOLS.OLS (x,y);
>               x[100] = sum;
>       }
>                       
>       watch.Stop();
>       Console.WriteLine ("unsafe ols: " + sum + ", elapsed: " +
>               watch.Elapsed);
> }
> 
> 
> Here is the C++ version of OLS:
> 
> 
> static double OLS (double* x, double* y, int len)
> {
>       double eXY = 0.0;
>       double eXX = 0.0;
>       double eX = 0.0;
>       double eY = 0.0;
>       
>       for (int i = 0 ; i < len ; i++)
>       {
>               double vx = x[i];
>               double vy = y[i];
>       
>               eXY += vx*vy;
>               eXX += vx*vx;
>               eX += vx;
>               eY += vy;
>       }
>       
>       double n = (double)len;
>       return (eXY - eX * eY / n) / (eXX - eX * eX / n);
> }
> 
> static void TestOLS (int dim, int iterations)
> {
>       double* x = new double[dim];
>       double* y = new double[dim];
> 
>       for (int i = 0 ; i < dim ; i++)
>       {
>               x[i] = i;
>               y[i] = i*i / 1000.0;
>       }
> 
>       long Tstart = CurrentTimeMilli();
>       
>       double sum = 0;
>       for (int i = 0 ; i < iterations ; i++)
>       {
>               sum += OLS (x,y, dim);
>               x[100] = sum;
>       }
>       
>       long Tend = CurrentTimeMilli();
>       long Telapsed = (Tend-Tstart);
>       
>       printf ("OLS: %lf, elapsed: %02d:%02d:%03d\n", sum,
>               (int)(Telapsed / 60000), (int)(Telapsed % 60000) / 1000,
>               (int)(Telapsed % 1000));
> }
> 
> int main (int argc, char *argv[])
> {
>       TestOLS (50000, 100000);
>       return 0;
> }
> 
>  
> Thanks in advance for any pointers and analysis.
> 
> I will send another post with the link in a bit.
> Jonathan
> 
> 
> On Nov 20, 2011, at 3:28 AM, Stefanos A. wrote:
> 
>> 2011/11/20 Jonathan Shore <jonathan.sh...@gmail.com>
>> Slide, not really.  If mono SIMD had a more general mapping to the GPU, or 
>> could operate on very large vectors or matrices, possibly.   Linear algebra 
>> maps easily onto that kind of hardware.   However, I do more complicated 
>> work on timeseries, which does not really fit into linear algebra.
>> 
>> I guess what I'm really after is to understand why the unsafe 
>> implementation is hardly faster than the "safe" version, whereas on the 
>> .NET CLR it is 2x as fast, and nearly as fast as the C++ implementation.    
>> There is no GC or object creation involved here, just arrays and 
>> computations.
>> 
>> Without sharing some code, it's almost impossible to tell what might be the 
>> cause of the discrepancy or any ways to improve performance. Have you 
>> measured performance with the regular JITter rather than LLVM?
> 
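
The C++ driver quoted above calls CurrentTimeMilli(), which the post does not define. The sketch below is one possible stand-in, assuming a std::chrono::steady_clock wrapper is acceptable; it is not the original helper. It also repeats the OLS arithmetic from the post and adds a hypothetical SlopeOfKnownLine() check on y = 2x + 1, where the fitted slope should be 2.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical stand-in for CurrentTimeMilli(): milliseconds read from a
// monotonic clock. The name matches the post; the body is an assumption.
static long long CurrentTimeMilli ()
{
    using namespace std::chrono;
    return duration_cast<milliseconds> (
        steady_clock::now ().time_since_epoch ()).count ();
}

// Same arithmetic as the OLS routine in the post. It returns the slope
//   (sum(xy) - sum(x) * sum(y) / n) / (sum(xx) - sum(x) * sum(x) / n)
static double OLS (const double* x, const double* y, int len)
{
    double eXY = 0.0, eXX = 0.0, eX = 0.0, eY = 0.0;

    for (int i = 0; i < len; i++) {
        double vx = x[i];
        double vy = y[i];

        eXY += vx * vy;
        eXX += vx * vx;
        eX += vx;
        eY += vy;
    }

    double n = (double)len;
    return (eXY - eX * eY / n) / (eXX - eX * eX / n);
}

// Hypothetical check: fit the line y = 2x + 1, whose slope is 2.
static double SlopeOfKnownLine (int dim)
{
    assert (dim > 1);  // a single point has no defined slope

    std::vector<double> x (dim), y (dim);
    for (int i = 0; i < dim; i++) {
        x[i] = i;
        y[i] = 2.0 * i + 1.0;
    }
    return OLS (x.data (), y.data (), dim);
}
```

steady_clock is used here because it never runs backwards, which suits an elapsed-time measurement; the post's `long Tstart = CurrentTimeMilli();` still compiles against this version through an implicit conversion.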

_______________________________________________
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
