Hi Micael,

Thanks for the patch, I pushed it.

Maybe I will try to reproduce your performance strangeness later... what did
you do to switch between inline/non-inline?  Simply remove the "inline"
keyword?  By the way, you can check what the compiler has done with
"objdump -S mypaint-tiled-surface.os" (the source lines only show up next to
the disassembly if the object was built with debug info).  Not sure if there
is another relevant optimization step at linking time, though.

If you're adding profiling code so close to the critical loop, chances are
that the compiler will find different optimizations, or that you produce a
completely different memory access pattern that uses the CPU caches
differently... just guessing.  If it's inlining the function and unrolling the
for loop, it might even realize that you never use most of the return values
of that function, and not bother calculating them.
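
To illustrate that last guess, here is a minimal sketch (not the code from
your pastebin) of a timing loop that consumes every result, so the optimizer
cannot throw the work away once the call is inlined.  The names
calc_rr_sketch, sink and time_loop_ms are made up for this example, and the
formula is only a stand-in for the real calculate_rr.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the per-pixel function being profiled;
   the real calculate_rr in mypaint is different. */
static inline float calc_rr_sketch(float dx, float dy, float one_over_r2)
{
    return (dx * dx + dy * dy) * one_over_r2;
}

/* A store to a volatile object is an observable side effect, so the
   compiler has to compute the value that ends up here. */
static volatile float sink;

/* Runs n calls and returns the elapsed CPU time in milliseconds. */
static double time_loop_ms(size_t n)
{
    clock_t start = clock();
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float dx = (float)(i & 63);
        float dy = (float)((i >> 6) & 63);
        /* Every return value feeds into acc, so none can be skipped. */
        acc += calc_rr_sketch(dx, dy, 1.0f / 4096.0f);
    }
    sink = acc;  /* consume the accumulated result once, after the loop */
    return 1000.0 * (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

If the timings of the inlined variant change a lot once the results are
consumed like this, the earlier numbers were probably measuring a loop that
the compiler had partly removed.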

I have measured with tests/test_performance.py -c 7:

1.419 seconds (no AA)
1.554 seconds (AA, new code)
1.550 seconds (AA, old code)

To switch between AA and non-AA I simply changed the radius threshold
to an extreme value.  The difference between the last two disappears in the
measurement noise of that test.  The difference from the non-AA version is
surprisingly small in this test.
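
Put as relative numbers, using only the three timings above (percent_slower
is just a made-up helper name for the arithmetic):

```c
/* Relative change of t over the baseline t_ref, in percent.
   Hypothetical helper, only here to spell out the arithmetic. */
static double percent_slower(double t_ref, double t)
{
    return 100.0 * (t - t_ref) / t_ref;
}

/* percent_slower(1.419, 1.554): AA (new code) vs. no AA, about 9.5 %      */
/* percent_slower(1.550, 1.554): AA new vs. AA old code, well below 1 %    */
```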

Regards
Martin

On Wed, Jan 23, 2013 at 05:32:33AM +0000, Micael wrote:
> Once again I discovered bad behaviour on AA when an elliptical brush
> is rotated, therefore I decided to rewrite the function. I'm attaching
> the patch. I have tested this quite a bit and it seems to solve all
> issues and I think it even improves AA quality.
> 
> Since the function is more complex than the previous, I decided to do
> some profiling. I have pasted a snippet of the code I used to do this
> here: http://pastebin.com/MpXFQCx6
> 
> The results are VERY weird.
> This is what I get for not inlined calls:
> 
> no-aa time: 856 ms
> aa time: 816 ms
> 
> And this is what I get for inlined calls:
> 
> no-aa time: 855 ms
> aa time: 482 ms
> 
> So it appears that the "calculate_rr" function is not getting inlined
> for some reason, but even when not inlined "calculate_rr_antialiased"
> finishes faster for some reason I can't explain. Either some compiler
> voodoo is happening here or my profiling code is bogus.
> 
> I have also noticed that when the brush aspect ratio is very high and
> the hardness is low, the dark spot at the center of the dabs tends to
> disappear, however this is due to it being antialiased and I believe
> there's little I can do here.
> 
> -- 
> Micael Dias



-- 
Martin Renold

_______________________________________________
Mypaint-discuss mailing list
https://mail.gna.org/listinfo/mypaint-discuss
