And also re-attaching the patch! On Fri, 17 Sept 2021 at 23:17, Anthony Sharp <anthonyshar...@gmail.com> wrote: > > Re-adding gcc-patches@gcc.gnu.org. > > ---------- Forwarded message --------- > From: Anthony Sharp <anthonyshar...@gmail.com> > Date: Fri, 17 Sept 2021 at 23:11 > Subject: Re: [PATCH] c++: error message for dependent template members > [PR70417] > To: Jason Merrill <ja...@redhat.com> > > > Hi Jason! Apologies for the delay. > > > This is basically core issue 1835, http://wg21.link/cwg1835 > > > This was changed for C++23 by the paper "Declarations and where to find > > them", http://wg21.link/p1787 > > Interesting, I was not aware of that. I was very vaguely aware that a > template-id in a class member access expression could be found by > ordinary lookup (very bottom of here > https://en.cppreference.com/w/cpp/language/dependent_name), but it's > interesting to see it is deeper than I realised. > > > But in either case, whether create<U> is in a dependent scope depends on > > how we resolve impl::, we don't need to remember further back in the > > expression. So your dependent_expression_p parameter seems like the > > wrong approach. Note that when we're looking up the name after ->, the > > type of the object expression is in parser->context->object_type. > > That's true. I think my thinking was that since it already got figured > out in cp_parser_postfix_dot_deref_expression, which is where . and -> > access expressions come from, I thought I might as well pass it > through, since it seemed to work. But looking again, you're right, > it's not really worth the hassle; might as well just call > dependent_scope_p again. > > > The cases you fixed in symbol-summary.h are indeed mistakes, but not > > ill-formed, so giving an error on them is wrong. For example, here is a > > well-formed program that is rejected with your patch: > > > template <class T, int N> void f(T t) { t.m<N>(0); } > > struct A { int m; } a; > > int main() { f<A,42>(a); } > > I suppose there was always going to be edge-cases when doing it the > way I've done. But yes, it can be worked-around by making it a warning > instead. Interestingly Clang doesn't trip up on that example, so I > guess they must be examining it some other way (e.g. at instantiation > time) - but that approach perhaps misses out on the slight performance > improvement this seems to bring. > > > Now that we're writing C++, I'd prefer to avoid this kind of pattern in > > favor of RAII, such as saved_token_sentinel. If this is still relevant > > after addressing the above comments. > > Sorry, it's the junior developer in me showing! So this confused me at > first. After having mucked around a bit I tried using > saved_token_sentinel but didn't see any benefit since it doesn't > rollback on going out of scope, and I'll always want to rollback. I > can call rollback directly, but then I might as well save and restore > myself. So what I did was use it but also modify it slightly to > rollback by default on going out of scope (in my mind that makes more > sense, since if something goes wrong you wouldn't normally want to > commit anything that happened [edit 1: unless committing was part of > the whole sanity checking thing] [edit 2: well I guess you could also > argue that since this is a parser afterall, we like to KEEP things > sometimes]). But anyways, I made this configurable; it now has three > modes - roll-back, commit or do nothing. Let me know if you think > that's not the way to go. > > > This code doesn't handle skipping matched ()/{}/[] in the > > template-argument-list. You probably want to involve > > cp_parser_skip_to_end_of_template_parameter_list somehow. > > Good point. It required some refactoring, but I have used it. Also, > just putting it out there, this line from > cp_parser_skip_to_end_of_template_parameter_list makes zero sense to > me (why throw an error OR immediately return?), but I have worked > around it, since it seems to break without it: > > > /* Are we ready, yet? If not, issue error message. */ > > if (cp_parser_require (parser, CPP_GREATER, RT_GREATER)) > > return false; > > Last thing - I initially made a mistake. I put something like: > > (next_token->type == CPP_NAME > && MAYBE_CLASS_TYPE_P (parser->scope) > && !constructor_name_p (cp_expr (next_token->u.value, > > next_token->location), > parser->scope)) > > Instead of: > > !(next_token->type == CPP_NAME > && MAYBE_CLASS_TYPE_P (parser->scope) > && constructor_name_p (cp_expr (next_token->u.value, > > next_token->location), > parser->scope)) > > This meant a lot of things were being excluded that weren't supposed > to be. Oops! Changing this opened up a whole new can of worms, so I > had to make some changes to the main logic, but it just a little bit > in the end. > > Regtested everything again and all seems fine. Bootstraps fine. Patch > attached. Let me know if it needs anything else. > > Thanks for the help, > Anthony > > > > On Fri, 27 Aug 2021 at 22:33, Jason Merrill <ja...@redhat.com> wrote: > > > > On 8/20/21 12:56 PM, Anthony Sharp via Gcc-patches wrote: > > > Hi, hope everyone is well. I have a patch here for issue 70417 > > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70417). I'm still a GCC > > > noob, and this is probably the hardest thing I have ever coded in my > > > life, so please forgive any initial mistakes! > > > > > > TLDR > > > > > > This patch introduces a helpful error message when the user attempts > > > to put a template-id after a member access token (., -> or ::) in a > > > dependent context without having put the "template" keyword after the > > > access token. > > > > Great, thanks! > > > > > CONTEXT > > > > > > In C++, when referencing a class member (using ., -> or ::) in a > > > dependent context, if that member is a template, the template keyword > > > is required after the access token. For example: > > > > > > template<class T> void MyDependentTemplate(T t) > > > { > > > t.DoSomething<T>(); /* Error - t is dependent. "<" is treated as a > > > less-than sign because DoSomething is assumed to be a non-template > > > identifier. */ > > > t.template DoSomething<T>(); // Good. > > > } > > > > > > PATCH EXPLANATION > > > > > > In order to throw an appropriate error message we need to identify > > > and/or create a point in the compiler where the following conditions > > > are all simultaneously satisfied: > > > > > > A) a class member access token (., ->, ::) > > > B) a dependent context > > > C) the thing being accessed is a template-id > > > D) No template keyword > > > > > > A, B and D are relatively straightforward and I won't go into detail > > > about how that was achieved. The real challenge is C - figuring out > > > whether a name is a template-id. > > > > > > Usually, we would look up the name and use that to determine whether > > > the name is a template or not. But, we cannot do that. We are in a > > > dependent context, so lookup cannot (in theory) find anything at the > > > point the access expression is parsed. > > > > > > We (maybe) could defer checking until the template is actually > > > instantiated. But this is not the approach I have gone for, since this > > > to me sounded unnecessarily awkward and clunky. In my mind this also > > > seems like a syntax error, and IMO it seems more natural that syntax > > > errors should get caught as soon as they are parsed. > > > > > > So, the solution I went for was to figure out whether the name was a > > > template by looking at the surrounding tokens. The key to this lies in > > > being able to differentiate between the start and end of a template > > > parameter list (< and >) and greater-than and less-than tokens (and > > > perhaps also things like << or >>). If the final > (if we find one at > > > all) does indeed mark the end of a class or function template, then it > > > will be followed by something like (, ::, or even just ;. On the other > > > hand, a greater-than operator would be followed by a name. > > > > > > PERFORMANCE IMPACT > > > > > > My concern with this approach was that checking this would actually > > > slow the compiler down. In the end it seemed the opposite was true. By > > > parsing the tokens to determine whether the name looks like a > > > template-id, we can cut out a lot of the template-checking code that > > > gets run trying to find a template when there is none, making compile > > > time generally faster. That being said, I'm not sure if the > > > improvement will be that substantial, so I did not feel it necessary > > > to introduce the template checking method everywhere where we could > > > possibly have run into a template. > > > > > > I ran an ABAB test with the following code repeated enough times to > > > fill up about 10,000 lines: > > > > > > ai = 1; > > > bi = 2; > > > ci = 3; > > > c.x = 4; > > > (&c)->x = 5; > > > mytemplateclass<Y>::y = 6; > > > c.template class_func<Y>(); > > > (&c)->template class_func<Y>(); > > > mytemplateclass<Y>::template class_func_static<Y>(); > > > c2.class_func<myclass>(); > > > (&c2)->class_func<myclass>(); > > > myclass::class_func_static<myclass>(); > > > ai = 6; > > > bi = 5; > > > ci = 4; > > > c.x = 3; > > > (&c)->x = 2; > > > mytemplateclass<Y>::y = 1; > > > c.template class_func<Y>(); > > > (&c)->template class_func<Y>(); > > > mytemplateclass<Y>::template class_func_static<Y>(); > > > c2.class_func<myclass>(); > > > (&c2)->class_func<myclass>(); > > > myclass::class_func_static<myclass>(); > > > > > > It basically just contains a mixture of class access expressions plus > > > some regular assignments for good measure. The compiler was compiled > > > without any optimisations (and so slower than usual). In hindsight, > > > perhaps this test case assumes the worst-ish-case scenario since it > > > contains a lot of templates; most of the benefits of this patch occur > > > when parsing non-templates. > > > > > > The compiler (clean) and the compiler with the patch applied (patched) > > > compiled the above code 20 times each in ABAB fashion. On average, the > > > clean compiler took 2.26295 seconds and the patched compiler took > > > 2.25145 seconds (about 0.508% faster). Whether this improvement > > > remains or disappears when the compiler is built with optimisations > > > turned on I haven't tested. My main concern was just making sure it > > > didn't get any slower. > > > > > > AWKWARD TEST CASES > > > > > > You will see from the patch that it required the updating of a few > > > testcases (as well as one place within the compiler). I believe this > > > is because up until now, the compiler allowed the omission of the > > > template keyword in some places it shouldn't have. Take decltype34.C > > > for example: > > > > > > struct impl > > > { > > > template <class T> static T create(); > > > }; > > > > > > template<class T, class U, class = > > > decltype(impl::create<T>()->impl::create<U>())> > > > struct tester{}; > > > > > > GCC currently accepts this code. My patch causes it to be rejected. > > > > > For what it's worth, Clang also rejects this code: > > > > > > 1824786093/source.cpp:6:70: error: use 'template' keyword to treat > > > 'create' as a dependent template name > > > template<class T, class U, class = > > > decltype(impl::create<T>()->impl::create<U>())> > > > > > > ^ template > > > > > > I think Clang is correct since from what I understand it should be > > > written as impl::create<T>()->impl::template create<U>(). Here, > > > create<T>() is dependent, so it makes sense that we would need > > > "template" before the scope operator. Then again, I could well be > > > wrong. The rules around dependent template names are pretty crazy. > > > > This is basically core issue 1835, http://wg21.link/cwg1835 > > > > This was changed for C++23 by the paper "Declarations and where to find > > them", http://wg21.link/p1787 > > > > Previously, e.g. in C++20, > > https://timsong-cpp.github.io/cppwp/n4861/basic.lookup.classref#4 said > > that when we see ->impl, since we can't look up the name in the > > (dependent) object type, we do unqualified lookup and find ::impl, so > > create<U> is not in a dependent scope, and the testcase is fine. > > > > The 1835 change clarifies that we only do unqualified lookup of the > > second impl if the object type is non-dependent, so the second create is > > in a dependent scope, and we need the ::template. > > > > But in either case, whether create<U> is in a dependent scope depends on > > how we resolve impl::, we don't need to remember further back in the > > expression. So your dependent_expression_p parameter seems like the > > wrong approach. Note that when we're looking up the name after ->, the > > type of the object expression is in parser->context->object_type. > > > > 1835 also makes the following testcase valid, which we don't currently > > accept, but clang does: > > > > template <class T> void f(T t) { t.foo::bar(); } > > struct foo { void bar(); }; > > int main() { f(foo()); } > > > > For C++20 and down, I want to start accepting this testcase, but also > > still accept decltype34.C to avoid breaking existing code. But that's a > > separate issue; I don't think your patch should affect decltype34. > > > > The cases you fixed in symbol-summary.h are indeed mistakes, but not > > ill-formed, so giving an error on them is wrong. For example, here is a > > well-formed program that is rejected with your patch: > > > > template <class T, int N> void f(T t) { t.m<N>(0); } > > struct A { int m; } a; > > int main() { f<A,42>(a); } > > > > Comparing the result of < by > is very unlikely to be what the user > > actually meant, but it is potentially valid. So we should accept f with > > a warning about the dubious expression. People can silence the warning > > if they really mean this by parenthesizing the LHS of the < operator. > > > > Another valid program, using :: > > > > int x; > > template <class T, int N> void f(T t) { t.m<N>::x; } > > struct A { int m; } a; > > int main() { f<A,42>(a); } > > > > Now to reviewing the actual patch: > > > > > + yet). Return 1 if it does look like a template-id. Return 2 > > > + if not. */ > > > > Let's use -1 for definitely not. > > > > > + /* Could be a ~ referencing the destructor of a class template. > > > + Temporarily eat it up so it's not in our way. */ > > > + if (next_token->type == CPP_COMPL) > > > + { > > > + cp_lexer_save_tokens (parser->lexer); > > > + cp_lexer_consume_token (parser->lexer); > > > + next_token = cp_lexer_peek_token (parser->lexer); > > > + saved = true; > > > + } > > > > There's no ambiguity in the case of a destructor, as you note in your > > comments in cp_parser_id_expression. How about returning early if we see ~? > > > > > + /* Find offset of final ">". */ > > > + for (; depth > 0; i++) > > > + { > > > + switch (cp_lexer_peek_nth_token (parser->lexer, i)->type) > > > + { > > > + case CPP_LESS: > > > + depth++; > > > + break; > > > + case CPP_GREATER: > > > + depth--; > > > + break; > > > + case CPP_RSHIFT: > > > + depth-=2; > > > + break; > > > + case CPP_EOF: > > > + case CPP_SEMICOLON: > > > + goto no; > > > + default: > > > + break; > > > + } > > > + } > > > > This code doesn't handle skipping matched ()/{}/[] in the > > template-argument-list. You probably want to involve > > cp_parser_skip_to_end_of_template_parameter_list somehow. > > > > > +no: > > > + if (saved) > > > + cp_lexer_rollback_tokens (parser->lexer); > > > + return 2; > > > + > > > +yes: > > > + if (saved) > > > + cp_lexer_rollback_tokens (parser->lexer); > > > + return 1; > > > > Now that we're writing C++, I'd prefer to avoid this kind of pattern in > > favor of RAII, such as saved_token_sentinel. If this is still relevant > > after addressing the above comments. > > > > > + If DEPENDENT_EXPRESSION_P is true, this is a dependent id-expression > > > + that is not the current instantiation. */ > > > > As mentioned above, let's not add this parameter. > > > > > + error_at (next_token->location, > > > + "expected \"template\" keyword before dependent template " > > > + "name"); > > > > As mentioned above, this should be a warning. > > > > > - /* If it worked, we're done. */ > > > - if (cp_parser_parse_definitely (parser)) > > > - return id; > > > + if (looks_like_template != 2) > > > + { > > > + /* We don't know yet whether or not this will be a > > > + template-id. */ > > > > The indentation here is off. > > > > > + int looks_like_template) > > > > Let's give these parms a default argument of 0. And call them > > looks_like_template_id to be clearer. > > > > > - /* Avoid performing name lookup if there is no possibility of > > > - finding a template-id. */ > > > - if ((token->type != CPP_NAME && token->keyword != RID_OPERATOR) > > > - || (token->type == CPP_NAME > > > - && !cp_parser_nth_token_starts_template_argument_list_p > > > - (parser, 2))) > > > + /* Only if we have not already checked whether this looks like a > > > + template. */ > > > + if (looks_like_template == 0) > > > { > > > - cp_parser_error (parser, "expected template-id"); > > > - return error_mark_node; > > > + /* Avoid performing name lookup if there is no possibility of > > > + finding a template-id. */ > > > > You could add the looks_like_template_id check to the existing condition > > so we don't need to reindent the comment or body. > > > > > +++ b/gcc/symbol-summary.h > > > @@ -191,7 +191,7 @@ public: > > > template<typename Arg, bool (*f)(const T &, Arg)> > > > void traverse (Arg a) const > > > { > > > - m_map.traverse <f> (a); > > > + m_map.template traverse <f> (a); > > > > I've gone ahead and applied this fix as a separate patch. > > > > Jason > >
pr70417_5.patch
Description: Binary data