On Fri, Jul 10, 2015 at 11:39 PM, Abe <abe_skol...@yahoo.com> wrote:
>> The GIMPLE level if-conversion code was purely
>> written  to make loops suitable for vectorization.
>
>
> I'm not surprised to read that.
>
>
>> It wasn't meant to provide if-conversion of
>> scalar code in the end (even though it does).
>
>
> Serendipity sure is nice.  ;-)
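
[A minimal illustrative sketch, not from the original thread, of the kind of
loop this GIMPLE-level if-conversion targets.  The guarded store blocks
straight-line vectorization as written; flattening the body removes the
control flow so the vectorizer can emit masked/blended vector code.]

/* Guarded update: control flow inside the loop body.  */
void saxpy_guarded (float *restrict y, const float *restrict x,
                    const int *restrict c, float a, int n)
{
  for (int i = 0; i < n; i++)
    if (c[i])
      y[i] = a * x[i] + y[i];
}

/* Roughly the shape if-conversion aims for: no branches in the body, so
   each iteration is a candidate vector lane.  Note the store has become
   unconditional, which is the safety question discussed later in this
   thread.  */
void saxpy_flattened (float *restrict y, const float *restrict x,
                      const int *restrict c, float a, int n)
{
  for (int i = 0; i < n; i++)
    y[i] = c[i] ? a * x[i] + y[i] : y[i];
}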
>
>
>> We've discussed enabling the versioning path unconditionally for example.
>
>
> It might make sense even without vectorization, for target architectures that
> [1] have either a wide range of predicated instructions or at least a usable
> "cmove-like" instruction _and_ [2] never run off battery, or do so only
> extremely rarely.  When running off "wall outlet" electricity [and not caring
> about the electric bill ;-)], wasted speculation is usually just wasted energy
> that didn't cost any extra time, because the CPU/core would have been idle
> otherwise.  When running off battery, the wasted energy _can_ be unacceptable,
> but is not _necessarily_ so: it depends on the customer/programmer/user's
> priorities, especially execution speed vs. how long a single charge lets the
> machine run.

Well, but we do have a pretty strong if-converter on RTL which has access
to target-specific information.
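
[For the scalar case, a hedged illustration, not part of the original mail, of
what if-conversion to a cmove-like instruction effectively does, and why the
"wasted speculation" mentioned above is real work:]

/* Branchy source: only one arm is evaluated per call.  */
int pick (int a, int b, int flag)
{
  if (flag)
    return a * 3 + 1;
  return b - 7;
}

/* What scalar if-conversion effectively does on a cmove-capable target:
   compute both arms, then select one result.  The untaken arm's work is
   the wasted speculation discussed above.  */
int pick_ifconverted (int a, int b, int flag)
{
  int when_true  = a * 3 + 1;
  int when_false = b - 7;
  return flag ? when_true : when_false;
}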

> The preceding makes me wonder: has anybody considered adding an optimization
> profile to GCC, alongside the set {"-O"..."-O3", "-Ofast", "-Os"}, that
> optimizes for the amount of energy consumed?  I don't remember reading about
> anything like that in relation to compiler research, but perhaps somebody
> reading this _has_ seen [or done!] something related and would kindly reply.
> Obviously, this is not an easy thing to figure out: in _most_ cases finishing
> the job sooner, i.e. running faster, means less energy spent computing it than
> would otherwise have been the case, but this is not _always_ true.  For
> example, speculation that is wasteful half the time may burn more energy than
> simply idling in a low-power state.

I think there were GCC summit papers/talks about this.

>
>> So if the new scheme with scratch-pads produces more "correct" code, but code
>> that is known to fail vectorization, then it's done at the wrong place -
>> because the whole purpose of GIMPLE if-conversion is to enable more
>> vectorization.
>
> I think I understand, and I agree.  The purpose of this pass is to enable more
> vectorization; the recently-reported fact that it can also enable more
> cmove-style non-vectorized code is a nice bonus, but is not the main objective.
>
> The main benefit of the new if converter is not relative to "GCC without any
> if conversion", but rather relative to the _old_ if converter.  The old one
> can, in some cases, produce code that e.g. dereferences a null pointer when
> the same program, given the same inputs, would not have done so without the
> if-conversion "optimization".

Testcase?  I don't think it can, and if it can, this bug needs to be fixed.

> The new converter reduces/eliminates this problem.  Therefore, my current
> main goal is to eliminate the performance regressions that are not spurious
> [i.e. are not a direct result of the old conversion being unsafe],
> so that the new converter can be merged to trunk and also enabled implicitly
> by "-O3" for autovectorization-enabled architectures, which the old converter
> AFAIK was _not_ [due to the aforementioned safety issues].

You mean the -ftree-loop-if-convert-stores path.
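
[To make the hazard and the proposed fix concrete, a rough sketch; it is
illustrative only, not the testcase asked for above, and the transformation
shapes are paraphrased rather than GCC's exact output.]

/* Original source: *p[i] is dereferenced only when the guard allows it;
   p[i] may be a null pointer whenever c[i] is zero.  */
void update (float **p, const float *v, const int *c, int n)
{
  for (int i = 0; i < n; i++)
    if (c[i])
      *p[i] = v[i];
}

/* The unsafe shape an unconditional-store if-conversion can produce:
   p[i] is now dereferenced on every iteration, so iterations the original
   program skipped can fault.  */
void update_unsafe (float **p, const float *v, const int *c, int n)
{
  for (int i = 0; i < n; i++)
    *p[i] = c[i] ? v[i] : *p[i];
}

/* The scratch-pad idea, roughly: keep the store unconditional (so the loop
   body stays branch-free for the vectorizer) but redirect it to a harmless
   local location whenever the guard is false.  */
void update_scratchpad (float **p, const float *v, const int *c, int n)
{
  float scratch;
  for (int i = 0; i < n; i++)
    {
      float *dest = c[i] ? p[i] : &scratch;
      *dest = v[i];
    }
}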

> In other words, the old if converter was like a sharp knife with a very
> small
> handle: usable by experts, but dangerous for people with little knowledge of
> the run-time properties of the code [e.g. will a pointer ever be null?] who
> just want to pass in "-O3" and have the code run faster without much
> thinking.
> A typical GCC user: "This code runs fine when compiled with '-O1'
> and '-O2', so with '-O3' it should also be fine, only faster!"
>
> IMO, only flags that _explicitly_ request unsafe transformations should be
> allowed to turn source code that runs perfectly when compiled at a low
> optimization setting into code that may crash, or may compute a different
> result than a less-optimized build does when given the same inputs
> [e.g. compiling floating-point code such that the executable ignores NaNs or
> treats denormals as zero].
> AFAIK this is in accordance with GCC's philosophy, which explains why the
> old
> if converter was not enabled by default. The _new_ if converter, OTOH, is
> safe
> enough to enable by default under "-O3", and should be beneficial for
> targets
> that support vector operations and for which the autovectorizer is
> successful
> in generating vector code.  Those are probably the main reasons why the new
> converter is worth hacking on to get it into shape,
> performance-regression-wise.
> Plus, this is work that my employer [Samsung] is willing and able
> to fund at this time [by paying my salary while I work on it ;-)].

Sure.

> Regards,
>
> Abe
