[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)

2019-01-22 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
 CC||rguenth at gcc dot gnu.org
  Component|c++ |libstdc++

--- Comment #1 from Richard Biener  ---
This is because it still needs to generate the std::string objects at the
caller
site (outside of the if (print)).  This involves quite some code to get
rid of, and even at -O3 we do not inline basic_string::basic_string it seems
(ISTR that is out-of-line in the library):

  __asm__ __volatile__("mfence" :  :  : "memory");
  _6 = MEM[(const int *)&data + 4B];
  if (_6 > 0)
goto ; [41.48%]
  else
goto ; [58.52%]

   [local count: 445388109]:
  std::basic_string::basic_string (&D.39204, "<", &D.39205);
  _7 = MEM[(char * *)&D.39204];
  _8 = _7 + 18446744073709551592;
  if (_8 != &_S_empty_rep_storage)
goto ; [10.00%]
  else
goto ; [90.00%]

   [local count: 434030711]:
  goto ; [100.00%]

   [local count: 44538811]:
  if (__gthrw___pthread_key_create != 0B)
goto ; [53.47%]
  else
goto ; [46.53%]

   [local count: 23814902]:
  _9 = &MEM[(struct _Rep *)_7 + -24B].D.23940._M_refcount;
  _10 = __atomic_fetch_add_4 (_9, 4294967295, 4);
  _11 = (int) _10;
  goto ; [100.00%]

   [local count: 20723909]:
  __result_12 = MEM[(_Atomic_word *)_7 + -8B];
  _13 = __result_12 + -1;
  MEM[(_Atomic_word *)_7 + -8B] = _13;

   [local count: 44538811]:
  # _14 = PHI <_11(6), __result_12(7)>
  if (_14 <= 0)
goto ; [25.50%]
  else
goto ; [74.50%]

   [local count: 11357397]:
  std::basic_string::_Rep::_M_destroy (_8, &D.39206);

   [local count: 445388108]:
  D.39206 ={v} {CLOBBER};
  D.39204 ={v} {CLOBBER};
  D.39205 ={v} {CLOBBER};

   [local count: 1073741825]:
  __asm__ __volatile__("mfence" :  :  : "memory");
  data ={v} {CLOBBER};

[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)

2019-01-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971

--- Comment #2 from Jonathan Wakely  ---
(In reply to Richard Biener from comment #1)
> rid of, and even at -O3 we do not inline basic_string::basic_string it seems
> (ISTR that is out-of-line in the library):

There's an explicit instantiation in the library, but the definition is inline
in the headers. If the compiler wanted to inline it, all the code is visible
and nothing forces it to use the explicit instantiation in the library.

[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)

2019-01-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971

--- Comment #3 from Jonathan Wakely  ---
(In reply to Richard Biener from comment #1)
> This is because it still needs to generate the std::string objects at the
> caller
> site (outside of the if (print)).  This involves quite some code to get
> rid of, and even at -O3 we do not inline basic_string::basic_string it seems
> (ISTR that is out-of-line in the library):
> 
>   __asm__ __volatile__("mfence" :  :  : "memory");
>   _6 = MEM[(const int *)&data + 4B];
>   if (_6 > 0)
> goto ; [41.48%]
>   else
> goto ; [58.52%]
> 
>[local count: 445388109]:
>   std::basic_string::basic_string (&D.39204, "<", &D.39205);
>   _7 = MEM[(char * *)&D.39204];
>   _8 = _7 + 18446744073709551592;
>   if (_8 != &_S_empty_rep_storage)
> goto ; [10.00%]
>   else
> goto ; [90.00%]

Looks like you're using -D_GLIBCXX_USE_CXX11_ABI=0 but the OP is not.

[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)

2019-01-22 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971

Richard Biener  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #4 from Richard Biener  ---
(In reply to Jonathan Wakely from comment #3)
> (In reply to Richard Biener from comment #1)
> > This is because it still needs to generate the std::string objects at the
> > caller
> > site (outside of the if (print)).  This involves quite some code to get
> > rid of, and even at -O3 we do not inline basic_string::basic_string it seems
> > (ISTR that is out-of-line in the library):
> > 
> >   __asm__ __volatile__("mfence" :  :  : "memory");
> >   _6 = MEM[(const int *)&data + 4B];
> >   if (_6 > 0)
> > goto ; [41.48%]
> >   else
> > goto ; [58.52%]
> > 
> >[local count: 445388109]:
> >   std::basic_string::basic_string (&D.39204, "<", &D.39205);
> >   _7 = MEM[(char * *)&D.39204];
> >   _8 = _7 + 18446744073709551592;
> >   if (_8 != &_S_empty_rep_storage)
> > goto ; [10.00%]
> >   else
> > goto ; [90.00%]
> 
> Looks like you're using -D_GLIBCXX_USE_CXX11_ABI=0 but the OP is not.

Indeed.  It's still missed inlining that makes elding of the argument
construction difficult.  Looks like

std::__cxx11::basic_string::_M_construct

is not marked inline (so needs -finline-functions to get IPA inlining
processing).  Indeed I see

  // For forward_iterators up to random_access_iterators, used for
  // string::iterator, _CharT*, etc.
  template
void
_M_construct(_FwdIterator __beg, _FwdIterator __end,
 std::forward_iterator_tag);

and others.

[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)

2019-01-22 Thread maratrus at mail dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971

--- Comment #5 from Marat Stanichenko  ---

(In reply to Richard Biener from comment #1)
> This is because it still needs to generate the std::string objects at the
> caller

Thank you very much for the comment! That is probably why I had to add another
string parameter to the function `PrintBad` in the example provided to trigger
this behaviour. The fact is that in the production environment where I managed
to spot this, the compiler could not optimize the function that is very similar
to `PrintGood` and has just a single string parameter. I didn't manage to
reproduce it when simplifying things while preparing a test-case here.

Do you believe that compiler can do better in such situations? Or the current
behaviour is perfectly valid and no improvements are really needed?

[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)

2019-01-22 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971

--- Comment #6 from rguenther at suse dot de  ---
On Tue, 22 Jan 2019, maratrus at mail dot ru wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971
> 
> --- Comment #5 from Marat Stanichenko  ---
> 
> (In reply to Richard Biener from comment #1)
> > This is because it still needs to generate the std::string objects at the
> > caller
> 
> Thank you very much for the comment! That is probably why I had to add another
> string parameter to the function `PrintBad` in the example provided to trigger
> this behaviour. The fact is that in the production environment where I managed
> to spot this, the compiler could not optimize the function that is very 
> similar
> to `PrintGood` and has just a single string parameter. I didn't manage to
> reproduce it when simplifying things while preparing a test-case here.
> 
> Do you believe that compiler can do better in such situations? Or the current
> behaviour is perfectly valid and no improvements are really needed?

The compiler can of course do better when estimating the benefit of
inlining.  It's just not entirely clear if it is reasonably easy to
do so ... [let aside the -finline-function issue]

[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)

2019-01-22 Thread maratrus at mail dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971

--- Comment #7 from Marat Stanichenko  ---
(In reply to rguent...@suse.de from comment #6)
> > Do you believe that compiler can do better in such situations? Or the 
> > current
> > behaviour is perfectly valid and no improvements are really needed?
> 
> The compiler can of course do better when estimating the benefit of
> inlining.  It's just not entirely clear if it is reasonably easy to
> do so ... [let aside the -finline-function issue]

Is it only about inlining?

Unfortunately, I cannot evaluate the technical difficulties but in patterns
like presented here i.e.

```
if (condition)
  Function(param);
```

irrespective of the fact whether `Function()` is asked to be inlined or not
there are at least two observations that the compiler may notice before taking
an optimization decision:

a) Whether `Function(param)` is empty or not. In both scenarios in the example
presented `PrintBad("<", ">", t)` and `PrintGood("<", t)` are
actually empty.

b) Whether `param` is used in a `Function`. I think I can see the examples when
function calls like `PrintBad(">", t)` generates a code to construct
parameter ">" despite the fact that not only it is not used but also the
function body is empty.

Of course, ideally I expect the parameter not to be constructed if it is not
used and the whole branch to be eliminated of the `Function` is empty and
`condition` does not have side effects. But as I said, I do not have enough
technical expertise to evaluate the cost. I speak from a solely user's
perspective.

What do you think is the best way to solve this? Keep tracking examples until
some critical mass is collected?

[Bug libstdc++/88971] Branch optimization inconsistency (missed optimization)

2019-01-22 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971

--- Comment #8 from rguenther at suse dot de  ---
On Tue, 22 Jan 2019, maratrus at mail dot ru wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971
> 
> --- Comment #7 from Marat Stanichenko  ---
> (In reply to rguent...@suse.de from comment #6)
> > > Do you believe that compiler can do better in such situations? Or the 
> > > current
> > > behaviour is perfectly valid and no improvements are really needed?
> > 
> > The compiler can of course do better when estimating the benefit of
> > inlining.  It's just not entirely clear if it is reasonably easy to
> > do so ... [let aside the -finline-function issue]
> 
> Is it only about inlining?
> 
> Unfortunately, I cannot evaluate the technical difficulties but in patterns
> like presented here i.e.
> 
> ```
> if (condition)
>   Function(param);
> ```
> 
> irrespective of the fact whether `Function()` is asked to be inlined or not
> there are at least two observations that the compiler may notice before taking
> an optimization decision:
> 
> a) Whether `Function(param)` is empty or not. In both scenarios in the example
> presented `PrintBad("<", ">", t)` and `PrintGood("<", t)` are
> actually empty.
> 
> b) Whether `param` is used in a `Function`. I think I can see the examples 
> when
> function calls like `PrintBad(">", t)` generates a code to construct
> parameter ">" despite the fact that not only it is not used but also the
> function body is empty.
> 
> Of course, ideally I expect the parameter not to be constructed if it is not
> used and the whole branch to be eliminated of the `Function` is empty and
> `condition` does not have side effects. But as I said, I do not have enough
> technical expertise to evaluate the cost. I speak from a solely user's
> perspective.
> 
> What do you think is the best way to solve this? Keep tracking examples until
> some critical mass is collected?

Well, it's more until any bright idea pops up how to solve this
abstraction issue...