On 16-May-07, at 10:44 AM, [EMAIL PROTECTED] wrote:

> On May 16, 2007, at 09:14 UTC, Frank Condello wrote:
>
>>> So - here is the 40 million dollar question - WHY is RB nearly 10
>>> times slower than C ?
>>
>> Typically the blame is laid on the lack of (or lax) compiler
>> optimizations.
>
> That blame would be misplaced in this case, though.

In this case, I think you're right, but...

>> RB loops are simply much slower than optimized C
>> loops, even with all the "speedy" pragmas in place.
>
> No, I don't think they are. Looping is pretty fast. But this
> particular loop is doing two virtual method calls per inner iteration,
> compared to the C code which is doing no function calls of any sort.
> That is almost certainly where all the extra time is going.

Daniel's profiling a loop that uses direct Ptr access - there are no function calls. His loops may not be entirely equivalent but he's using a reasonable approach in both cases. Unfortunately RB forces you to write uglier code due to the lack of strongly typed pointers, but that's a different problem...

The report evaluation suggests he wasn't testing in release builds, which is why he was seeing such a big discrepancy. However the report makes some incorrect assumptions as well - basically this test is flawed. Daniel's code is making a function call in the dylib case and no function calls in the RB case, so if all things were equal RB should've been _way_ faster, not just marginally faster. For a proper comparison the C code would need to contain the outer loop as well.

Trust me, I've done a lot of testing here myself, and even when using Ptr and all the appropriate pragmas, C loops simply spank RB loops in real world applications. I typically get around a 5x increase with C code, more if using trig functions since RB insists on wrapping those in an extra function call. C also gives you an opportunity to optimize further in some situations - e.g. a vector array Normalize function can use an inlined single-precision sqrt approximation that's impossible to reasonably implement in RB code. My C vec3.normalize function is over 30 times faster than anything I could come up with in RB, and it's not for lack of trying. You can gain a lot of speed in RB by manually unrolling loops but that's not always appropriate (or pretty).

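To make the Normalize point concrete, here is a minimal sketch of the kind of C routine being described. It is not the actual vec3.normalize code mentioned above: the names `approx_rsqrt` and `vec3_normalize_array` are made up for illustration, the packed x,y,z float layout is an assumption, and the approximation used is the well-known bit-trick inverse square root with one Newton-Raphson step, which is only one of several possible choices. Note that the whole loop lives inside the C function, so the caller pays for a single call rather than one call per vector.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: approximate 1/sqrt(x) for x > 0.
 * Bit-level initial guess refined by one Newton-Raphson step;
 * the result is accurate to a fraction of a percent. */
static inline float approx_rsqrt(float x)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);      /* reinterpret float bits safely */
    bits = 0x5f3759dfu - (bits >> 1);    /* rough initial guess */
    memcpy(&y, &bits, sizeof y);

    return y * (1.5f - 0.5f * x * y * y); /* one refinement step */
}

/* Hypothetical entry point: normalize `count` vectors in place.
 * Assumes `v` holds count * 3 floats laid out as x, y, z, x, y, z, ...
 * Zero-length vectors are left untouched. */
void vec3_normalize_array(float *v, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        float *p = v + 3 * i;
        float len2 = p[0] * p[0] + p[1] * p[1] + p[2] * p[2];

        if (len2 > 0.0f) {
            float inv = approx_rsqrt(len2);
            p[0] *= inv;
            p[1] *= inv;
            p[2] *= inv;
        }
    }
}
```

Built into a dylib, a routine like this would be called once per array from RB (for instance through a Declare), which is also the shape a fair C-versus-RB timing test would need: both sides running the full loop, with the call overhead paid once.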
I think we simply need a more reasonable test case to pinpoint the problem areas.

Frank.
<http://developer.chaoticbox.com/>
