"2) Reporter's assumption about fstp is wrong: the first fstp instruction removes value from fpu stack, so it cannot be used for the second time without first reloading value onto stack."
The compiler should reuse the loaded value (a[i]): store it to a[i] using fstl, then fstpl it to a[i+1].

On Fri, Jan 24, 2014 at 12:26 AM, Sergei Gorelkin <sergei_gorel...@mail.ru> wrote:

> 24.01.2014 3:04, Martin Frb wrote:
>
>> On 23/01/2014 22:26, August Oktobar wrote:
>>
>>> Hello, I have seen your mails about peephole optimization, so I wonder
>>> if you could look at this report:
>>> http://bugs.freepascal.org/view.php?id=23595
>>>
>>> or perhaps optimize slow array access using operator [] (it is faster to
>>> use pointer arithmetic)
>>>
>>> thanks!
>>
>> 1) I am just getting started on this, so I can only answer with limited
>> knowledge.
>> @experts, please correct me below, where needed
>> 2) The peephole opt is only something I do "on the side" as it currently
>> is.
>>
>> From a quick glance, this does not look like something for the peephole
>> opt.
>>
>> The peephole opt currently looks at statements that are close together
>> (follow each other immediately).
>> I am not sure to what extent (if at all) it would be acceptable to break
>> that limit (it is doable, question is if desired).
>>
>> In this specific case:
>> 1) between the "fstpl" and the "mov (half_the_data), %edi" are other
>> statements.
>> detecting the connection would either:
>>   - need scanning several statements ahead. This would be slow, because
>>     it would have to be done after each storing of "something to memory"
>>     (so very often)
>>   - need keeping state of all the involved registers and memory
>>     (do-able / interesting at least from a theoretical view / but not
>>     sure if desired)
>> 2) only half the mem is accessed, and then the other half. That means to
>> detect the connection between the mem read and the register, it is
>> needed to analyse 4 statements. Very unlikely to see this in the
>> peephole opt.
>>
>> The liveness of "a[i]" needs to be checked where the code is generated.
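
To illustrate the adjacency limit Martin describes, here is a rough Python sketch of a peephole pass that only ever compares an instruction with the one directly before it. It is not FPC's optimizer: the tuple format (mnemonic, source, destination), the peephole function and the is_reg helper are made up for illustration, and operands are assumed to be either registers (starting with %) or plain memory references.

```python
def is_reg(operand):
    """Hypothetical helper: AT&T-style register names start with '%'."""
    return operand.startswith("%")


def peephole(instrs):
    """Drop a reload that directly follows a store of the same register.

    Only the previous emitted instruction is inspected, which models a
    peephole window of two adjacent instructions.
    """
    out = []
    for ins in instrs:
        prev = out[-1] if out else None
        redundant_reload = (
            prev is not None
            and prev[0] == "mov" and ins[0] == "mov"
            and is_reg(prev[1]) and not is_reg(prev[2])  # prev: register -> memory
            and ins[1] == prev[2] and ins[2] == prev[1]  # ins: same memory -> same register
        )
        if not redundant_reload:
            out.append(ins)
    return out
```

With a store immediately followed by its reload, the reload is removed. As soon as an unrelated instruction sits between the two, this window no longer sees the connection and leaves all three instructions in place, which is the situation in the report.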
>
> 1) You are right that it's not the job for peephole analyzer, it is
> typical common subexpression elimination.
> 2) Reporter's assumption about fstp is wrong: the first fstp instruction
> removes value from fpu stack, so it cannot be used for the second time
> without first reloading value onto stack.
> 3) The assignments of floating-point values are currently being generated
> using integer instructions, hence the subsequent code. This way it doesn't
> depend on number of available FPU registers, which is hard to know at any
> point.
>
> Regards,
> Sergei
> _______________________________________________
> fpc-devel maillist - fpc-devel@lists.freepascal.org
> http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
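
The difference between fst (store, keep the value) and fstp (store, then pop) from point 2 can be modeled in a few lines of Python. This is a toy sketch only: the X87Stack class, the store_twice function and the dictionary standing in for memory are hypothetical, and a real FPU reports stack underflow through its own exception mechanism rather than a Python error.

```python
class X87Stack:
    """Toy model of the x87 register stack; st(0) is the last list element."""

    def __init__(self):
        self.regs = []

    def fld(self, value):
        # Push a value; the real register stack holds at most 8 entries.
        if len(self.regs) == 8:
            raise OverflowError("x87 stack overflow")
        self.regs.append(value)

    def fst(self, mem, addr):
        # Store st(0) to memory and leave it on the stack.
        if not self.regs:
            raise IndexError("x87 stack underflow")
        mem[addr] = self.regs[-1]

    def fstp(self, mem, addr):
        # Store st(0) to memory, then pop it off the stack.
        if not self.regs:
            raise IndexError("x87 stack underflow")
        mem[addr] = self.regs.pop()


def store_twice(first_store):
    """Load one value, store it with `first_store`, then store it again with fstp."""
    mem, fpu = {}, X87Stack()
    fpu.fld(1.5)
    getattr(fpu, first_store)(mem, "a[i]")
    fpu.fstp(mem, "a[i+1]")
    return mem, len(fpu.regs)
```

store_twice("fst") fills both cells from a single load and leaves the stack empty. store_twice("fstp") fails on the second store, because the first fstp has already popped the value.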