Re: PR83750: CSE erf/erfc pair

Prathamesh Kulkarni Thu, 08 Nov 2018 23:12:02 -0800

On Tue, 6 Nov 2018 at 16:04, Richard Biener <richard.guent...@gmail.com> wrote:
>
> On Mon, Nov 5, 2018 at 3:11 PM Prathamesh Kulkarni
> <prathamesh.kulka...@linaro.org> wrote:
> >
> > On Mon, 5 Nov 2018 at 18:14, Richard Biener <richard.guent...@gmail.com> 
> > wrote:
> > >
> > > On Mon, Nov 5, 2018 at 1:11 PM Prathamesh Kulkarni
> > > <prathamesh.kulka...@linaro.org> wrote:
> > > >
> > > > On Mon, 5 Nov 2018 at 15:10, Richard Biener 
> > > > <richard.guent...@gmail.com> wrote:
> > > > >
> > > > > On Fri, Nov 2, 2018 at 10:37 AM Prathamesh Kulkarni
> > > > > <prathamesh.kulka...@linaro.org> wrote:
> > > > > >
> > > > > > Hi,
> > > > > > This patch adds two transforms to match.pd to CSE erf/erfc pair.
> > > > > > erfc(x) is canonicalized to 1 - erf(x) and is then reversed to 1 -
> > > > > > erf(x) when canonicalization is disabled and result of erf(x) has
> > > > > > single use within 1 - erf(x).
> > > > > >
> > > > > > The patch regressed builtin-nonneg-1.c. The following test-case
> > > > > > reproduces the issue with patch:
> > > > > >
> > > > > > void test(double d1) {
> > > > > >   if (signbit(erfc(d1)))
> > > > > >     link_failure_erfc();
> > > > > > }
> > > > > >
> > > > > > ssa dump:
> > > > > >
> > > > > >   <bb 2> :
> > > > > >   _5 = __builtin_erf (d1_4(D));
> > > > > >   _1 = 1.0e+0 - _5;
> > > > > >   _6 = _1 < 0.0;
> > > > > >   _2 = (int) _6;
> > > > > >   if (_2 != 0)
> > > > > >     goto <bb 3>; [INV]
> > > > > >   else
> > > > > >     goto <bb 4>; [INV]
> > > > > >
> > > > > >   <bb 3> :
> > > > > >   link_failure_erfc ();
> > > > > >
> > > > > >   <bb 4> :
> > > > > >   return;
> > > > > >
> > > > > > As can be seen, erfc(d1) is folded to 1 - erf(d1).
> > > > > > forwprop then transforms the if condition from _2 != 0
> > > > > > to _5 > 1.0e+0 and that defeats DCE thus resulting in link failure
> > > > > > in undefined reference to link_failure_erfc().
> > > > > >
> > > > > > So, the patch adds another transform erf(x) > 1 -> 0.
> > > > >
> > > > > Ick.
> > > > >
> > > > > Why not canonicalize erf (x) to 1-erfc(x) instead?
> > > > Sorry I didn't quite follow, won't this cause similar issue with erf ?
> > > > I changed the pattern to canonicalize erf(x) -> 1 - erfc(x)
> > > > and 1 - erfc(x) -> erf(x) after canonicalization is disabled.
> > > >
> > > > This caused undefined reference to link_failure_erf() in following 
> > > > test-case:
> > > >
> > > > extern int signbit(double);
> > > > extern void link_failure_erf(void);
> > > > extern double erf(double);
> > > >
> > > > void test(double d1) {
> > > >   if (signbit(erf(d1)))
> > > >     link_failure_erf();
> > > > }
> > >
> > > But that's already not optimized without any canonicalization
> > > because erf returns sth in range [-1, 1].
> > >
> > > I suggested the change because we have limited support for FP
> > > value-ranges and nonnegative is one thing we can compute
> > > (and erfc as opposed to erf is nonnegative).
> > Ah right, thanks for the explanation.
> > Unfortunately this still regresses builtin-nonneg-1.c, which can be
> > reproduced with following test-case:
> >
> > extern int signbit(double);
> > extern void link_failure_erf(void);
> > extern double erf(double);
> > extern double fabs(double);
> >
> > void test(double d1) {
> >   if (signbit(erf(fabs(d1))))
> >     link_failure_erf();
> > }
> >
> > signbit(erf(fabs(d1)) is transformed to 0 without patch but with patch
> > it gets canonicalized to signbit(1 - erfc(fabs(d1))) which similarly
> > defeats DCE.
> >
> > forwprop1 shows:
> > <bb 2> :
> >   _1 = ABS_EXPR <d1_5(D)>;
> >   _6 = __builtin_erfc (_1);
> >   _2 = 1.0e+0 - _6;
> >   _7 = _6 > 1.0e+0;
> >   _3 = (int) _7;
> >   if (_6 > 1.0e+0)
> >     goto <bb 3>; [INV]
> >   else
> >     goto <bb 4>; [INV]
> >
> >   <bb 3> :
> >   link_failure_erf ();
> >
> >   <bb 4> :
> >   return;
> >
> > I assume we would need to somehow tell gcc that the canonicalized
> > expression 1 - erfc(x) would not exceed 1.0 ?
> > Is there a better way to do that apart from defining pattern (1 -
> > erfc(x)) > 1.0 -> 0
> > which I agree doesn't look ideal to add in match.pd ?
>
> You could handle a MINUS_EXPR of 1 and erfc() in
> gimple_assign_nonnegative_warnv_p (I wouldn't bother
> to do it for tree_binary_nonnegative_warnv_p)
>
Um won't the value range of 1 - erfc(x) be [-1.0, 1.0] (same as
erf(x)), so would adding it to gimple_assign_nonnegative_warnv_p be
correct ?
erf(fabs(x)) is non-negative and between [0.0, 1.0] although I am not
sure if 1 - erfc(fabs(x)) would be always non-negative too if
erfc(fabs(x)) can exceed 1.0 for some value of x ?
AFAIU for the above case, gcc doesn't know that 1 - erfc(x) cannot
exceed 1.0, and that's because we don't propagate value range for
floats as you pointed out.


Thanks,
Prathamesh
> This is of course similarly "lame" but a bit cleaner than
> a match.pd pattern that just works for the comparison.
>
> In reality the proper long-term fix is to add basic range
> propagation to floats.
>
> Richard.
>
> >
> > Thanks
> > Prathamesh
> > >
> > > > forwprop1 shows:
> > > >
> > > >    <bb 2> :
> > > >   _5 = __builtin_erfc (d1_4(D));
> > > >   _1 = 1.0e+0 - _5;
> > > >   _6 = _5 > 1.0e+0;
> > > >   _2 = (int) _6;
> > > >   if (_5 > 1.0e+0)
> > > >     goto <bb 3>; [INV]
> > > >   else
> > > >     goto <bb 4>; [INV]
> > > >
> > > >   <bb 3> :
> > > >   link_failure_erf ();
> > > >
> > > >   <bb 4> :
> > > >   return;
> > > >
> > > > which defeats DCE to remove call to link_failure_erf.
> > > >
> > > > Thanks,
> > > > Prathamesh
> > > > >
> > > > > > which resolves the regression.
> > > > > >
> > > > > > Bootstrapped+tested on x86_64-unknown-linux-gnu.
> > > > > > Cross-testing on arm and aarch64 variants in progress.
> > > > > > OK for trunk if passes ?
> > > > > >
> > > > > > Thanks,
> > > > > > Prathamesh

Re: PR83750: CSE erf/erfc pair

Reply via email to