http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59195

            Bug ID: 59195
           Summary: C++ demangler handles conversion operator incorrectly
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ccoutant at gcc dot gnu.org

Consider this simple source:

  struct C {
    C(int i_) : i(i_) { }
    int i;
  };

  struct A {
    A() : i(0) { }
    template <typename U>
    explicit operator U();
    int i;
  };

  void foo()
  {
    A ai;
    C c = (C)ai;
  }

The conversion operator in A mangles to "_ZN1AcvT_I1CEEv", which
demangles to:

  A::operator C<C>()

(I think this would be better as "A::operator C()", but let's
leave that aside for now.)
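
To reproduce this outside the compiler, the names can be fed to
the runtime demangler. The following is only a sketch: it assumes
a libstdc++ toolchain, where abi::__cxa_demangle (declared in
<cxxabi.h>) is built from the same cp-demangle.c, and the wrapper
name demangle() is made up for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <memory>
#include <string>

// Hypothetical helper (not part of GCC): demangle one symbol with the
// runtime demangler. Returns false if the demangler rejects the name,
// in which case `out` is left untouched.
bool demangle(const char* mangled, std::string& out) {
    assert(mangled != nullptr);
    int status = 0;
    // __cxa_demangle returns a malloc()ed buffer that the caller must
    // free; the unique_ptr with std::free takes care of that.
    std::unique_ptr<char, decltype(&std::free)> text(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status),
        &std::free);
    if (status != 0 || !text) {
        return false;
    }
    out = text.get();
    return true;
}
```

Calling demangle("_ZN1AcvT_I1CEEv", out) or
demangle("_ZN1AcvPT_I1CEEv", out) exercises the cases in this
report; what comes back depends on the libstdc++ version in use.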

That looks OK, but not if you dig deeper. The demangler is
parsing this name as follows:

  typed name
    qualified name
      name 'A'
      cast
        template
          template parameter 0
          template argument list
            name 'C'
    function type
      argument list
  A::operator C<C>()

It's parsing the "T_" as a <template-template-param>, not a
<template-param>. It should really be parsing it as:

  typed name
    template
      qualified name
        name 'A'
        cast
          template parameter 0
      template argument list
        name 'C'
    function type
      argument list
  A::operator C<C>()

The template argument list actually belongs with the
<qualified-name>, not with the template parameter.

The basic problem is an ambiguity in the grammar for mangled
names. Consider the following two derivations:

  (1) <nested-name>
      -> <template-prefix> <template-args>
      -> <prefix> <template-unqualified-name> <template-args>
      -> <unqualified-name> <template-unqualified-name>
                                                   <template-args>
      -> <source-name> <template-unqualified-name> <template-args>
      -> <source-name> <operator-name> <template-args>
      -> <source-name> cv <type> <template-args>
      -> <source-name> cv <template-template-param>
                                   <template-args> <template-args>

  (2) <nested-name>
      -> <template-prefix> <template-args>
      -> <prefix> <template-unqualified-name> <template-args>
      -> <unqualified-name> <template-unqualified-name>
                                                   <template-args>
      -> <source-name> <template-unqualified-name> <template-args>
      -> <source-name> <operator-name> <template-args>
      -> <source-name> cv <type> <template-args>
      -> <source-name> cv <template-param> <template-args>

When you get to the "T_", there's no way to know if it's a
<template-param> or a <template-template-param>. The parser in
cp-demangle.c, in cplus_demangle_type, looks ahead to
disambiguate the two. If it sees an 'I', it assumes that it's a
<template-template-param>, and it greedily consumes the
<template-args>.

In this particular case, it's wrong, but it gets the right answer
because of a hack in d_print_cast, which takes the template
arguments under the cast operator, and places them in scope
before resolving the "T_" reference.

It doesn't take much to break that hack, though. Let's throw in a
pointer:

  struct C {
    C(int i_) : i(i_) { }
    int i;
  };

  struct A {
    A() : i(0) { }
    template <typename U>
    explicit operator U*();
    int i;
  };

  void foo()
  {
    A ai;
    C* c = (C*)ai;
  }

Now we get the mangled name "_ZN1AcvPT_I1CEEv", which the
demangler fails on:

  typed name
    qualified name
      name 'A'
      cast
        pointer
          template
            template parameter 0
            template argument list
              name 'C'
    function type
      argument list
  Failed: _ZN1AcvPT_I1CEEv

This name ought to be parsed as follows:

  typed name
    template
      qualified name
        name 'A'
        cast
          pointer
            template parameter 0
      template argument list
        name 'C'
    function type
      argument list
  A::operator C*<C>()

(This incorrect parsing can also lead to other, subtler failures,
because substitutions can be numbered in the wrong order. See the
long example at the end of this report, which is what started me
on this investigation.)

Now let's add a real template template parameter:

  template <typename T>
  struct C {
    C(T i_) : i(i_) { }
    T i;
  };

  struct A {
    A() : i(0) { }
    template <template<typename U> class V>
    operator V<int>();
    int i;
  };

  void foo()
  {
    A ai;
    C<int> c = (C<int>)ai;
  }

The conversion operator is now "_ZN1AcvT_IiEI1CEEv", and the
demangler gives us this:

  typed name
    template
      qualified name
        name 'A'
        cast
          template
            template parameter 0
            template argument list
              builtin type int
      template argument list
        name 'C'
    function type
      argument list
  A::operator int<int><C>()

The derivation is actually correct this time, because we really
do have a <template-template-param>, but the hack in d_print_cast
causes it to substitute the wrong type for "T_".

I'm pretty sure that the ambiguity can be resolved by context.
Note that if we have a <template-template-param>, the
<template-args> that go with it are followed by another set of
<template-args> that go with the <template-prefix> in the second
step of the derivation. (This is only true if we're parsing a
conversion operator that derives from <nested-name>; if it's a
conversion operator that's part of an <expression>, the presence
of an 'I' should always indicate that we're looking at a
<template-template-param>.)

I've got a patch that solves this with a little backtracking. If
we're in a conversion operator that's not part of an
<expression>, and we see an 'I', it checkpoints the d_info
structure, does a trial parse of <template-args>, then looks for
another 'I'. If there isn't one, it backtracks and returns a
<template-param>, leaving the <template-args> for d_prefix to
parse.
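
The idea can be sketched with a toy model. This is not the patch
itself: a plain string index stands in for the checkpointed
d_info, all function names are hypothetical, and the trial parse
understands only the constructs that occur in the names quoted in
this report (I...E and N...E nesting, <source-name>,
substitutions, template parameter references, single-letter
codes). Expressions, literals and the like are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum class ParamKind { TemplateParam, TemplateTemplateParam };

const std::size_t kFail = std::string::npos;

// Trial-parse a <template-args> list starting at s[pos] == 'I'.
// Returns the index just past the matching 'E', or kFail.
std::size_t skip_template_args(const std::string& s, std::size_t pos) {
    if (pos >= s.size() || s[pos] != 'I') return kFail;
    int depth = 0;
    while (pos < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[pos]);
        if (std::isdigit(c)) {
            // <source-name>: decimal length, then that many bytes.
            std::size_t len = 0;
            while (pos < s.size() &&
                   std::isdigit(static_cast<unsigned char>(s[pos]))) {
                len = len * 10 + static_cast<std::size_t>(s[pos] - '0');
                ++pos;
            }
            if (len > s.size() - pos) return kFail;
            pos += len;
        } else if (c == 'S' || c == 'T') {
            // Substitution or template parameter reference. Skip it as
            // a unit so an 'I' or 'E' in a <seq-id> is not miscounted.
            ++pos;
            if (pos < s.size() &&
                std::islower(static_cast<unsigned char>(s[pos]))) {
                ++pos;  // two-letter forms such as "St" or "Sa"
            } else {
                while (pos < s.size() && s[pos] != '_') ++pos;
                if (pos == s.size()) return kFail;
                ++pos;  // consume the closing '_'
            }
        } else if (c == 'I' || c == 'N') {
            ++depth;  // opens a list that is closed by 'E'
            ++pos;
        } else if (c == 'E') {
            --depth;
            ++pos;
            if (depth == 0) return pos;
        } else {
            ++pos;  // builtin type code, qualifier, etc.
        }
    }
    return kFail;
}

// The existing lookahead: any 'I' after the parameter is taken to mean
// <template-template-param>.
ParamKind classify_greedy(const std::string& s, std::size_t after_param) {
    assert(after_param <= s.size());
    return (after_param < s.size() && s[after_param] == 'I')
               ? ParamKind::TemplateTemplateParam
               : ParamKind::TemplateParam;
}

// The backtracking variant, for a conversion operator that is not part
// of an <expression>.
ParamKind classify_with_backtracking(const std::string& s,
                                     std::size_t after_param) {
    assert(after_param <= s.size());
    if (after_param >= s.size() || s[after_param] != 'I') {
        return ParamKind::TemplateParam;
    }
    const std::size_t checkpoint = after_param;  // "save d_info"
    const std::size_t after_args = skip_template_args(s, checkpoint);
    if (after_args != kFail && after_args < s.size() &&
        s[after_args] == 'I') {
        // A second <template-args> follows, so the first one belongs
        // to a <template-template-param>.
        return ParamKind::TemplateTemplateParam;
    }
    // Backtrack: nothing past the checkpoint is consumed, and the
    // <template-args> are left for the enclosing prefix to parse.
    return ParamKind::TemplateParam;
}
```

With the position just past "T_" as the second argument, the
greedy check reports a <template-template-param> for all three
short names in this report. The backtracking variant reports a
<template-param> for "_ZN1AcvT_I1CEEv" and "_ZN1AcvPT_I1CEEv",
and a <template-template-param> only for "_ZN1AcvT_IiEI1CEEv".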

When printing a conversion operator, in d_print_cast, if the
operator is derived from a <template>, we need to place that
template in scope so that the <template-param> can look up the
correct type. We can't just add it to the stack of templates,
though, since that's only applicable when printing the type of a
function whose name is a template.
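
To make the scoping issue concrete, here is a toy model of the
lookup for "_ZN1AcvT_IiEI1CEEv". It is not the demangler's data
structures: a template argument list is reduced to the printed
form of each argument, and both function names are hypothetical.
The operator's own argument list is <C>, and the list applied to
"T_" is <int>.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a template argument list: printed argument text.
typedef std::vector<std::string> TemplateArgs;

// Resolve "T_" (index 0), "T0_" (index 1), ... against whichever
// argument list is currently in scope.
std::string resolve_template_param(const TemplateArgs& scope,
                                   std::size_t index) {
    assert(index < scope.size());
    return scope[index];
}

// Print the target type of a conversion operator whose type is a
// <template-template-param> applied to its own arguments, e.g. V<int>.
std::string print_conversion_type(const TemplateArgs& scope,
                                  std::size_t param_index,
                                  const TemplateArgs& applied) {
    std::string out = resolve_template_param(scope, param_index) + "<";
    for (std::size_t i = 0; i < applied.size(); ++i) {
        if (i != 0) out += ", ";
        out += applied[i];
    }
    return out + ">";
}
```

Resolving against the operator's own list {"C"} gives "C<int>".
Resolving against the list found under the cast, {"int"}, gives
"int<int>", which is the wrong type seen in the output above.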

Here's the long example that started me down this path:

  _ZNK7strings8internal8SplitterINS_9delimiter5AnyOfENS_9SkipEmptyEEcvT_ISt6vectorI12basic_stringIcSt11char_traitsIcESaIcEESaISD_EEvEEv

The demangler fails to parse this string, because the "SD_" near
the end is asking for the 15th substitution, but at that point it
has only registered 14 substitutions. That happened because it
greedily consumed the <template-args> that follow "cvT_", and
hadn't yet registered the substitution for the conversion
operator (which derives from <prefix>). Any substitutions that
are contained in the <template-args> will be off by one as a
result.
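
For reference, the arithmetic behind "the 15th substitution" can
be sketched as follows. In the Itanium C++ ABI mangling, "S_" is
the first substitution and "S<seq-id>_" is number seq-id + 1,
counting from zero, where the seq-id is base 36 using digits and
upper-case letters. The function names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode "S_" or "S<seq-id>_" into a zero-based index into the
// substitution table. Returns -1 if `ref` is not of that form.
long substitution_index(const std::string& ref) {
    if (ref.size() < 2 || ref.front() != 'S' || ref.back() != '_') {
        return -1;
    }
    if (ref.size() == 2) return 0;  // "S_" is the first substitution
    long value = 0;
    for (std::size_t i = 1; i + 1 < ref.size(); ++i) {
        const char c = ref[i];
        int digit;
        if (c >= '0' && c <= '9') {
            digit = c - '0';
        } else if (c >= 'A' && c <= 'Z') {
            digit = c - 'A' + 10;
        } else {
            return -1;
        }
        value = value * 36 + digit;
    }
    return value + 1;  // "S0_" is the second substitution, and so on
}

// True if `ref` can be resolved when `registered` substitutions have
// been recorded so far. Precondition: `ref` is well formed.
bool is_resolvable(const std::string& ref, std::size_t registered) {
    const long index = substitution_index(ref);
    assert(index >= 0);
    return static_cast<std::size_t>(index) < registered;
}
```

Here substitution_index("SD_") is 14, so the reference needs 15
registered substitutions; is_resolvable("SD_", 14) is false,
which is the failure described above.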
