Re: Shocking difference in performance between RB and C when copying float data between memoryblocks

Daniel Stenning Wed, 16 May 2007 03:46:16 -0700

Also just tried rewriting the RB loop to not use STEP:

  dim index as integer
  for j as integer = 0 to iterations
    for i as integer = 0 to count
      'pB.UInt32(i) = pA.UInt32(i)
      pB.Single(index) = pA.Single(index) //* count
      index = index+4
    Next
    index = 0
  Next


But in fact this comes in at about 55 seconds, versus 45 when using STEP 4.
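
For anyone who wants to try the C side without building a dylib, here is a rough standalone sketch of the same copy test. copySingle is the function from my earlier message quoted below; time_copy, its last_out parameter and the use of clock() are just my assumptions for illustration and were not part of the test I actually ran. An optimizing compiler is free to simplify the repeated copies, so treat any number it produces as approximate.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Same element-by-element copy as the dylib function in the original test. */
void copySingle(int count, float *A, float *B) {
    int i;
    for (i = 0; i < count; i++) {
        B[i] = A[i];
    }
}

/* Hypothetical helper (not in the original test): fills a source buffer
   with 0, 1, 2, ..., copies it `iterations` times into a target buffer,
   stores the last copied element in *last_out, and returns the elapsed
   CPU time in seconds. Returns -1.0 if allocation fails. */
double time_copy(int count, int iterations, float *last_out) {
    assert(count > 0 && last_out != NULL);

    float *A = malloc((size_t)count * sizeof *A);
    float *B = calloc((size_t)count, sizeof *B); /* zeroed target buffer */
    if (A == NULL || B == NULL) {
        free(A);
        free(B);
        return -1.0;
    }

    /* stuff A with incrementing single values, as in the RB version */
    for (int i = 0; i < count; i++) {
        A[i] = (float)i;
    }

    clock_t start = clock();
    for (int j = 0; j < iterations; j++) {
        copySingle(count, A, B);
    }
    clock_t end = clock();

    /* Read back one copied value so the result of the copy is used. */
    *last_out = B[count - 1];

    free(A);
    free(B);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy(44100, 3000, &last) from a small main and printing the return value should give a figure to set against the dylib timing.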

On 16/5/07 11:31, "Daniel Stenning" <[EMAIL PROTECTED]> wrote:

> I just tried rerunning the test but with all the contents of the inner loop
> commented out - 
> 
> So in the RB PTR case:
> 
>   dim u as integer = Count * 4
>    for j as integer = 0 to iterations
>      for i as integer = 0 to u step 4
>        // pB.Single(i) = pA.Single(i)
>      Next
>    Next
> 
> And in C dylib:
> 
>  void copySingle (int count, float * A, float * B ) {
>     int i;
>     for ( i = 0; i < count ; i++ )  {
>        // B[i] = A[i];
>     }
>  }
> 
> The C dylib version went down to around 2 seconds (I think it was less),
> while the RB version is around 19 seconds.
> 
> 
> 
> On 16/5/07 02:21, "Daniel Stenning" <[EMAIL PROTECTED]> wrote:
> 
>> Here's a simple test I did which simply copies 44100 floats from one
>> memoryblock to another.  It does this 3000 times in order to allow timing to
>> be done.
>> 
>> I did this initially to see if there is a significant performance gain from
>> using Ptr pointers to the memoryblocks to do the reads and copies, versus
>> using the memoryblock SingleValue() methods.
>> 
>> Indeed there is. Using Ptr is faster.
>> 
>> But out of curiosity I thought I'd see how near the performance was compared
>> to doing the core copy operation in C - using a dylib. So I wrote a quick
>> dylib with a single function:
>> 
>> void copySingle (int count, float * A, float * B ) {
>>    int i;
>>    for ( i = 0; i < count ; i++ )  {
>>       B[i] = A[i];
>>    }
>> }
>> 
>> 
>> And then via a declare I called this function 3000 times (passing 44100 as
>> the count argument).
>> 
>> The time taken in RB using Ptr:  around 45 seconds
>> And by calling the C dylib:  5 seconds.
>> 
>> This is on a G4 PowerBook.
>> 
>> I was NOT expecting that, and that 45 second timing was even after using
>> pragmas to reduce overheads.
>> 
>> I was expecting RB to be a little slower than C but not  THAT much slower.
>> After all - RB is supposed to be natively compiled - thus comparable to C.
>> 
>> Here is the code for the RB version:
>> ****************************************************************
>> 
>>   #pragma DisableAutoWaitCursor
>>   #pragma DisableBackgroundTasks
>>   #pragma DisableBoundsChecking
>>   #pragma StackOverflowChecking false
>>   
>>   Dim Count As integer = 44100 -1
>>   Dim iterations as integer = 3000
>>   
>>   // source buffer
>>   Dim A as new MemoryBlock( 44100 * 4)
>>   
>>   // stuff A with incrementing single values
>>   for i as integer = 0 to Count
>>     A.SingleValue(i*4) = i
>>   Next
>>   
>>   // target buffer
>>   Dim B as new MemoryBlock( 44100 * 4)
>>   
>>   // use pointers for speed
>>   dim pA As Ptr = A
>>   dim pB as Ptr = B
>>   
>>   Result.Text = "Started"
>>   app.DoEvents
>>   
>>   dim u as integer = Count * 4
>>   for j as integer = 0 to iterations
>>     for i as integer = 0 to u step 4
>>       pB.Single(i) = pA.Single(i)
>>     Next
>>   Next
>>   
>>   Beep
>>   
>>   Result.Text= "FINISHED"
>> 
>> *************************************
>> And here is the version that calls the C function in the dylib:
>> 
>>   
>>   soft Declare Sub copySingle  Lib LIBPATH ( c as integer, ptA as Ptr, ptB as Ptr )
>>   
>>   Dim Count As integer = 44100
>>   Dim iterations as integer = 3000
>>   
>>   Dim A as new MemoryBlock( 44100 * 4)
>>   
>>   // stuff A with incrementing single values
>>   for i as integer = 0 to Count-1
>>     A.SingleValue(i*4) = i
>>   Next
>>   
>>   Dim B as new MemoryBlock( 44100 * 4)
>>   
>>   dim pA As Ptr = A
>>   dim pB as Ptr = B
>>   
>>   Result.Text = "Started"
>>   app.DoEvents
>>   
>>   dim u as integer = Count
>>   for j as integer = 0 to iterations
>>     copySingle( u, pA, pB )
>>   Next
>>   
>>   Beep
>>   Result.Text= "FINISHED"
>> 
>> 
>> 
>> So - here is the 40 million dollar question - WHY is RB nearly 10 times
>> slower than C ?
>> Is this something that can and should be addressed by RS ?
>> 
>> 
>> _______________________________________________
>> Unsubscribe or switch delivery mode:
>> <http://www.realsoftware.com/support/listmanager/>
>> 
>> Search the archives:
>> <http://support.realsoftware.com/listarchives/lists.html>
>> 
> 
> Regards,
> 
> Dan
> 
> 
> 
> 

Regards,

Dan



