On 15 Mar 2007 19:58:54 +0300, Egor Pasko <[EMAIL PROTECTED]> wrote:
this should hypothetically improve one simple code pattern (that is
probably widely used?), i.e.: for (int i=0;i<I;i++){A[i]=X;}
What I figured out looking at the patch:
* [pass.2] does not seem to throw any AIOutOfBoundsException
* [pass.2] does not have any useful tuning parameters such as number
of unrolls per loop, thus the scheme eats potential benefit
from loop unrolling and does not give any tuning back
AFAIK this optimization is much more efficient than loop unrolling on
the microtest you mentioned.
* [pass.1] detects such a rare pattern that I doubt it would benefit a
user (but obviously will benefit a you-know-which-benchmark
runner)
Generic Arrays.fill-like methods could be optimized this way.
+ IMO even a few percent on widely known
benchmarks is reason enough to implement even more complicated optimizations.
* [pass.1] has a lot of new code that introduces potential instability
(if the pattern is not detected properly; the code does
not read easily), but does not contain a single unit test
or the like. Together with the AIOOBE issue, stability becomes a
real question.
All known bugs can be fixed. If AIOOBE is a real problem here, it looks
easy to fix too. The question is whether the optimization gives any benefit
or not.
We can move it into a separate HLO pass (and a separate
file) and drop it from the codebase if it's not needed in the future.
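To make the AIOOBE point concrete, here is a minimal Java sketch of what a
bulk fill has to preserve. It is an illustration only, not code from the
patch: the class FillSketch and the methods fillLoop and fillBulk are
hypothetical names. The assumption is that the loop bound may exceed the
array length, in which case the original loop writes every valid element
and then throws.

```java
import java.util.Arrays;

public class FillSketch {
    // Reference semantics: the simple loop pattern under discussion.
    static void fillLoop(int[] a, int n, int x) {
        for (int i = 0; i < n; i++) {
            a[i] = x;
        }
    }

    // Hypothetical bulk version with the same observable behavior.
    static void fillBulk(int[] a, int n, int x) {
        if (n <= 0) {
            return; // the loop body would never run, so 'a' is not touched
        }
        // Reading a.length throws NullPointerException for a null array,
        // just as a[0] would in the loop.
        int safe = Math.min(n, a.length);
        Arrays.fill(a, 0, safe, x); // writes the loop would have completed
        if (n > a.length) {
            // The loop would fail at the first out-of-range index.
            throw new ArrayIndexOutOfBoundsException(a.length);
        }
    }

    public static void main(String[] args) {
        int[] a = new int[4];
        try {
            fillBulk(a, 6, 9);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("thrown after filling " + Arrays.toString(a));
        }
    }
}
```

The partial fill before the throw matters because a caller that catches
the exception can observe the array contents afterwards.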
* back branch polling is not performed (which is probably good for
performance, but I would rather have a tuning option)
Do you think that the latency of a mem-copy-like optimization can be a problem here?
What I can say more is that a good "ABCD" optimization complemented
with a "loop versioning" optimization will make for more readable, more
stable code, AND will give a better performance gain (loop unrolling
still kicks in too). That is setting aside the fact that the overall design
will be more straightforward (no interdependent passes, extra helpers, etc.).
So I vote for focusing on ABCD plus "loop versioning" and leaving
specific benchmark-oriented tricks (complicating our design) alone.
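For readers unfamiliar with the term, loop versioning can be sketched at
the Java source level as follows. This is only an illustration: a JIT would
do the transformation on its IR, and the class VersionedFill and its method
fill are hypothetical names. Both loop bodies look identical in source; the
difference is that inside the guarded version the compiler can prove every
index is in range and drop the per-iteration bounds check.

```java
public class VersionedFill {
    static void fill(int[] a, int n, int x) {
        if (a != null && n <= a.length) {
            // Fast version: the guard proves 0 <= i < a.length for every
            // iteration, so bounds checks are provably redundant here and
            // the loop is free to be unrolled.
            for (int i = 0; i < n; i++) {
                a[i] = x;
            }
        } else {
            // Slow version: the original loop with all checks kept, so any
            // exception is thrown at exactly the same point as before.
            for (int i = 0; i < n; i++) {
                a[i] = x;
            }
        }
    }

    public static void main(String[] args) {
        int[] a = new int[8];
        fill(a, 5, 3);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

The guard is evaluated once per loop entry, so its cost is amortized over
all iterations of the fast version.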
I support focusing on loop
versioning/ABCD and other general-purpose optimizations we do not have today.
And until these optimizations give better results on your microtest, we
can use Nikolay's approach. At least it works better today.
--
Mikhail Fursov