You can try a few potential optimizations in the D version yourself and see if they make a difference.

Devirtualization has a very small impact. Test this by making `test` take `SubFoo` and making `bar` final, or making `bar` a stand-alone function.

That's not it.
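For reference, here is a minimal sketch of the devirtualized variant. The class layout is my assumption (the original `Foo`/`SubFoo` definitions aren't shown here), and the body of `bar` is inferred from the math used in the loop further down:

```d
// Hypothetical class layout; the real Foo/SubFoo may differ.
class Foo {
    int i;
    void bar() {} // virtual by default in D
}

class SubFoo : Foo {
    // `final` rules out further overrides, so a call made through a
    // SubFoo reference can be bound statically instead of via the vtable.
    final override void bar() {
        i = i * 3 + 1;
    }
}

// Takes SubFoo rather than Foo so the compiler knows the exact method.
int test(SubFoo obj, int repeat) {
    for (int r = 0; r < repeat; ++r) {
        obj.bar();
    }
    return obj.i;
}
```

Starting from `i == 0`, three iterations give 1, 4, 13, which is handy as a sanity check that the variants still compute the same thing.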

Inlining alone doesn't make a huge difference either - test this by copy/pasting the `bar` method body into the `test` function.
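A sketch of that hand-inlined version, assuming `bar` does nothing more than `i = i * 3 + 1` (the `SubFoo` here is a hypothetical stand-in, and `testInlined` is just a name I picked):

```d
// Hypothetical stand-in for the real class.
class SubFoo {
    int i;
}

// bar's body pasted straight into the loop: no call remains, but every
// iteration still reads and writes the member through the obj reference.
int testInlined(SubFoo obj, int repeat) {
    for (int r = 0; r < repeat; ++r) {
        obj.i = obj.i * 3 + 1;
    }
    return obj.i;
}
```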

But we can see a *huge* difference if we inline AND make the data local:

int test(SubFoo obj, int repeat) {
        int i = obj.i; // local variable copy
        for (int r = 0; r < repeat; ++r) {
                //obj.bar();
                i = i * 3 + 1; // do the math on the local
        }
        obj.i = i; // save it back to the object so the outside world sees the same result
        return obj.i;
}



That cuts the time on my computer to less than half that of the previous fastest version.
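If you want to measure it yourself, something like the following rough harness should do. It uses `StopWatch` from `std.datetime.stopwatch`; the names `testMember`, `testLocal` and `elapsedMsecs` are made up for this sketch, and `SubFoo` is again a hypothetical stand-in:

```d
import std.datetime.stopwatch : AutoStart, StopWatch;

// Hypothetical stand-in for the real class.
class SubFoo {
    int i;
}

// Member version: each iteration goes through obj.i.
int testMember(SubFoo obj, int repeat) {
    for (int r = 0; r < repeat; ++r)
        obj.i = obj.i * 3 + 1;
    return obj.i;
}

// Local version: copy the member out, loop on the local, store it back.
int testLocal(SubFoo obj, int repeat) {
    int i = obj.i;
    for (int r = 0; r < repeat; ++r)
        i = i * 3 + 1;
    obj.i = i;
    return obj.i;
}

// Runs fn once on a fresh object and returns the elapsed wall-clock
// time in milliseconds; the computed value comes back through `result`.
long elapsedMsecs(int function(SubFoo, int) fn, int repeat, out int result) {
    auto obj = new SubFoo;
    auto sw = StopWatch(AutoStart.yes);
    result = fn(obj, repeat);
    sw.stop();
    return sw.peek().total!"msecs";
}
```

Call it as `elapsedMsecs(&testMember, n, result)` and `elapsedMsecs(&testLocal, n, result)` with a large `n`, and check that both produce the same `result`. The numbers will depend on your compiler and flags, so treat them as relative, not absolute.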

So I suspect the JVM is able to figure out that the `i` member is being used heavily and keeps it in a hot cache instead of accessing it indirectly through the object each time, just like I did by hand there.

I betcha if the loop ran 5 times, it would be no different, but the JVM realizes after hundreds of iterations that there's a huge optimization potential there and rewrites the code at that point, making it faster for the next million runs.
