Re: [PATCH v4 5/8] Start adding the infrastructure for handling TypoExprs.

David Blaikie Mon, 27 Oct 2014 10:45:48 -0700

Looks good, please commit whenever you're ready.

I think it might be nice to explore modelling the Consumer as a range
exposing iterators, etc rather than having internal state
(getNextCorrection+finished+resetCorrectionStream), leveraging existing C++
idioms like this may make the code a bit easier to understand at first
glance (maybe, anyway - it'd also provide the usual benefits of the
iterator pattern (separation of concerns between the sequence and its
iteration, possibility of multiple active iterations, etc) but I don't know
that any of those would be particularly important/useful in this particular
situation at the moment)


On Mon, Oct 27, 2014 at 9:57 AM, Kaelyn Takata <[email protected]> wrote:

> Here's the latest version of just this patch. The new test case is in
> patch 8/8.
>
> On Thu, Oct 23, 2014 at 11:23 AM, David Blaikie <[email protected]>
> wrote:
>
>>
>>
>> On Thu, Oct 23, 2014 at 11:14 AM, Kaelyn Takata <[email protected]> wrote:
>>
>>> Well, I've found a test case that would trigger the assertion but in the
>>> code's current form does the right thing:
>>>
>>
>> Great!
>>
>>
>>>
>>> class Foo {
>>> public:
>>>   template <> void testIt();
>>>   void textIt();
>>> };
>>>
>>> void test(Foo *f) {
>>>   f->TestIt();
>>> }
>>>
>>>
>>> With the patchset in it's current incarnation (including the fix I
>>> mentioned in my last email):
>>>
>>
>> I didn't quite follow the description of the other fix/issue there - few
>> too many patches in my head. Could you send the latest version of this
>> patch under review?
>>
>>
>>>
>>> fail.cc:3:20: error: no function template matches function template
>>> specialization 'testIt'
>>>   template <> void testIt();
>>>                    ^
>>> fail.cc:8:6: error: no member named 'TestIt' in 'Foo'; did you mean
>>> 'textIt'?
>>>   f->TestIt();
>>>      ^~~~~~
>>>      textIt
>>> fail.cc:4:8: note: 'textIt' declared here
>>>   void textIt();
>>>        ^
>>> 2 errors generated.
>>>
>>>
>>> With the "if (!NE.isInvalid())" as an assertion:
>>>
>>> fail.cc:3:20: error: no function template matches function template
>>> specialization 'testIt'
>>>   template <> void testIt();
>>>                    ^
>>> clang-3.6: ../tools/clang/lib/Sema/SemaExprCXX.cpp:6055: ExprResult
>>> (anonymous namespace)::TransformTypos::TransformTypoExpr(clang::TypoExpr
>>> *): Assertion `!NE.isInvalid()' failed.
>>>
>>>
>>> Note that the (invalid) function template "testIt" has an edit distance
>>> of 1 from "TestIt" and the proper suggestion, "textIt", has an edit
>>> distance of 2. I've added the above test case to
>>> typo-correction-delayed.cpp.
>>>
>>>
>>> On Thu, Oct 23, 2014 at 11:05 AM, David Blaikie <[email protected]>
>>> wrote:
>>>
>>>> (sorry for the repetition - I'm failing at 'reply all' this morning)
>>>>
>>>> On Thu, Oct 23, 2014 at 10:53 AM, Kaelyn Takata <[email protected]>
>>>> wrote:
>>>>
>>>>> On Wed, Oct 22, 2014 at 2:19 PM, David Blaikie <[email protected]>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 21, 2014 at 1:37 PM, Kaelyn Takata <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Oct 17, 2014 at 9:52 AM, David Blaikie <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Oct 9, 2014 at 10:12 AM, David Blaikie <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Oct 9, 2014 at 9:58 AM, Kaelyn Takata <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Oct 7, 2014 at 5:03 PM, David Blaikie <[email protected]
>>>>>>>>>> > wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Oct 7, 2014 at 2:29 PM, Kaelyn Takata <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Oct 7, 2014 at 11:43 AM, David Blaikie <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 6, 2014 at 3:46 PM, Kaelyn Takata <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Oct 3, 2014 at 11:45 AM, David Blaikie <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Given that these are parts of a patch series, some of which
>>>>>>>>>>>>>>> have already been signed off on/I haven't looked at, it's a bit 
>>>>>>>>>>>>>>> hard to
>>>>>>>>>>>>>>> review the overall design - but I believe you & Richard talked 
>>>>>>>>>>>>>>> that over,
>>>>>>>>>>>>>>> so I'm not too worried about that (just explaining why my 
>>>>>>>>>>>>>>> review feedback's
>>>>>>>>>>>>>>> going to be a bit narrow)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +  if (NumTypos > 0 && !ExprEvalContexts.empty())
>>>>>>>>>>>>>>> +    ExprEvalContexts.back().NumTypos += NumTypos;
>>>>>>>>>>>>>>> +  else
>>>>>>>>>>>>>>> +    assert(NumTypos == 0 && "There are outstanding typos after 
>>>>>>>>>>>>>>> popping the "
>>>>>>>>>>>>>>> +                            "last 
>>>>>>>>>>>>>>> ExpressionEvaluationContextRecord");
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Could this be simplified? I think it's equivalent to:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   if (!ExprEvalContexts.empty())
>>>>>>>>>>>>>>>     ExprEvalContexts.back().NumTypos += NumTypos;
>>>>>>>>>>>>>>>   else
>>>>>>>>>>>>>>>     assert(NumTypos == 0 ... );
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> or:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   assert(ExprEvalContexts.empty() || NumTypos != 0 && ...)
>>>>>>>>>>>>>>>   if (!ExprEvalContexts.empty())
>>>>>>>>>>>>>>>     ...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, the "NumTypos > 0" was unnecessary and has been
>>>>>>>>>>>>>> removed. (FYI, the second form you suggested doesn't quite work 
>>>>>>>>>>>>>> because the
>>>>>>>>>>>>>> assert would fail on the common case where ExprEvalContexts is 
>>>>>>>>>>>>>> not empty
>>>>>>>>>>>>>> and there are no typos.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hmm, right, maybe:
>>>>>>>>>>>>>
>>>>>>>>>>>>>   assert(ExprEvalContexts.empty() ^ NumTypos != 0 && ... )
>>>>>>>>>>>>>   if (!ExprEvalContexts.empty())
>>>>>>>>>>>>>     ExprEvalContexts.back().NumTypos += NumTypos;
>>>>>>>>>>>>>
>>>>>>>>>>>>> but perhaps that's just too confusing. I just find an
>>>>>>>>>>>>> assertion as the only thing inside a block of code to be a bit 
>>>>>>>>>>>>> off... but
>>>>>>>>>>>>> it's certainly not unprecedented in LLVM and is probably more 
>>>>>>>>>>>>> legible -
>>>>>>>>>>>>> just my own hangup to get over :)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> There's an unnecessary semicolon at the end of the "while 
>>>>>>>>>>>>>> (true)" loop in TransformTypos::Transform
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Removed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'd roll this up a bit:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +      auto &State = SemaRef.getTypoExprState(TE);
>>>>>>>>>>>>>>> +      if (State.DiagHandler)
>>>>>>>>>>>>>>> +        
>>>>>>>>>>>>>>> (*State.DiagHandler)(State.Consumer->getCurrentCorrection());
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> into:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   if (auto *Handler = SemaRef.getTypoExprState(TE).DiagHandler)
>>>>>>>>>>>>>>>     (*DiagHandler)(State.Consumer->getCurrentCorrection());
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That rollup doesn't work because State would be unknown when
>>>>>>>>>>>>>> accessing State.Consumer in the argument to the DiagHandler.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Oh, right!
>>>>>>>>>>>>>
>>>>>>>>>>>>>> count in the condition, plus insert in the body of the if in 
>>>>>>>>>>>>>> TransformTypos::TransformTypoExpr would be more efficient as a 
>>>>>>>>>>>>>> single insert-and-check (by avoiding two separate lookups into 
>>>>>>>>>>>>>> the set):
>>>>>>>>>>>>>>>   if (TypoExprs.insert(E).second) // means this element 
>>>>>>>>>>>>>>> previously didn't exist in the set, and now does
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've changed the count-then-insert to just an insert since
>>>>>>>>>>>>>> SmallPtrSetImpl::insert returns true when the item wasn't 
>>>>>>>>>>>>>> already in the
>>>>>>>>>>>>>> set.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   TransformCache.find(E) != TransformCache.end()
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> might commonly be written as:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   TransformCache.count(E)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Optionally with "!= 0", but often count is used directly as a 
>>>>>>>>>>>>>>> boolean test, especially when it's a single set/map (not a 
>>>>>>>>>>>>>>> multi set/map).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Changed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This comment didn't quite make sense to me:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +      // If corrections for the first TypoExpr have been 
>>>>>>>>>>>>>>> exhausted, retry them
>>>>>>>>>>>>>>> +      // against the next combination of substitutions for all 
>>>>>>>>>>>>>>> of the other
>>>>>>>>>>>>>>> +      // TypoExprs.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When you say "retry them" which them do you mean? The 
>>>>>>>>>>>>>>> corrections for the first TypoExpr? But what does it mean for 
>>>>>>>>>>>>>>> them to be exhausted then, if they can still be considered?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Maybe I need a bit of help with the higher level algorithm here 
>>>>>>>>>>>>>>> - I assume the idea is:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Call transform.
>>>>>>>>>>>>>>> As it visits TypoExpr nodes, it tries their first typo 
>>>>>>>>>>>>>>> candidate available.
>>>>>>>>>>>>>>> If we get to the end and have invalid code, reset the state of 
>>>>>>>>>>>>>>> all the typo corrections except the first one (on its next 
>>>>>>>>>>>>>>> query, it'll give back its second alternative).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here is where the confusion is at. As the transform visits
>>>>>>>>>>>>>> each TypoExpr node it tries the first candidate available. If 
>>>>>>>>>>>>>> the first
>>>>>>>>>>>>>> candidates of all the TypoExprs don't work, it then tries the 
>>>>>>>>>>>>>> next (and
>>>>>>>>>>>>>> subsequent) candidates of only the first TypoExpr that was seen. 
>>>>>>>>>>>>>> When it
>>>>>>>>>>>>>> exhausts the candidates from the first TypoExpr, it then resets 
>>>>>>>>>>>>>> the stream
>>>>>>>>>>>>>> of candidates for the first TypoExpr and fetches the next 
>>>>>>>>>>>>>> candidate of the
>>>>>>>>>>>>>> TypoExpr, continuing in this way until all combinations of 
>>>>>>>>>>>>>> candidates of
>>>>>>>>>>>>>> all TypoExprs have been tried.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> So I'm still a bit confused about which parts of the retry
>>>>>>>>>>>>> logic are in which parts of the code.
>>>>>>>>>>>>>
>>>>>>>>>>>>> OK, I think I've got it. Transform is called and we don't even
>>>>>>>>>>>>> know which TypoExprs are in the Expr. So we transform.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As transform runs it comes across TypoExprs, and attempts to
>>>>>>>>>>>>> transform them. For each one, it tries candidates until it finds 
>>>>>>>>>>>>> a good one
>>>>>>>>>>>>> (based on the immediate context - what are the cases where
>>>>>>>>>>>>> BuildDeclarationNameExpr fails here? (when does NE.isInvalid 
>>>>>>>>>>>>> fail?)).
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I think there is a bit of confusion on what a "retry" is.
>>>>>>>>>>>> Within TransformTypoExpr, for either the first TypoExpr ever 
>>>>>>>>>>>> encountered by
>>>>>>>>>>>> this TreeTransform or for a TypoExpr without a cached 
>>>>>>>>>>>> substitution, it
>>>>>>>>>>>> tries to build a valid subexpression from the TypoCorrection 
>>>>>>>>>>>> candidates
>>>>>>>>>>>> (e.g. a DeclRefExpr) and continues trying candidates until it 
>>>>>>>>>>>> finds one for
>>>>>>>>>>>> which a valid subexpression can be formed, which it caches and 
>>>>>>>>>>>> returns. If
>>>>>>>>>>>> none of the candidates worked, it will cache and return an invalid
>>>>>>>>>>>> ExprResult. The TreeTransform then takes the returned 
>>>>>>>>>>>> subexpression and
>>>>>>>>>>>> tries to rebuild the full expression (possibly trying to transform 
>>>>>>>>>>>> multiple
>>>>>>>>>>>> TypoExprs along with CallExprs, etc).
>>>>>>>>>>>>
>>>>>>>>>>>> I don't remember off the top of my head without looking at the
>>>>>>>>>>>> body, but I think BuildDeclarationNameExpr will fail in cases 
>>>>>>>>>>>> where the
>>>>>>>>>>>> correction is e.g. a type name, or a member reference 
>>>>>>>>>>>> name--basically,
>>>>>>>>>>>> BuildDeclarationNameExpr returns an ExprResult and the correct 
>>>>>>>>>>>> behavior in
>>>>>>>>>>>> principle is to check for an invalid result and try the next 
>>>>>>>>>>>> correction
>>>>>>>>>>>> candidate when one is seen--even if the Consumer never returns a 
>>>>>>>>>>>> candidate
>>>>>>>>>>>> for which BuildDeclarationNameExpr would fail.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, in your example below, maybe the right answer is BEH, but
>>>>>>>>>>>>> AF is valid, with no correct candidate for the 3 choice.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> No, AF with no correction for the 3rd TypoExpr would result in
>>>>>>>>>>>> the entire Expr result being invalid within the main Transform 
>>>>>>>>>>>> method.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Consumers start at "before the first" element, calling
>>>>>>>>>>>>> "getNextCorrection" on a new consumer gives you the first 
>>>>>>>>>>>>> correction.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> That is correct. And the "before the first" element is also an
>>>>>>>>>>>> empty correction for use with getCurrentCorrection. ;)
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On the first call to TransformExpr, the first call to
>>>>>>>>>>>>> TransformTypoExpr chooses A, records it, continues.
>>>>>>>>>>>>> Gets to the second TransformTypoExpr, tries D and E, succeeds
>>>>>>>>>>>>> at F.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> it would only try D then E then F within a single call to
>>>>>>>>>>>> TransformTypoExpr if BuildDeclarationNameExpr failed for D and E.
>>>>>>>>>>>>
>>>>>>>>>>>>> On the third TransformTypoExpr, it tries H and I, both fail,
>>>>>>>>>>>>> so it returns from the last line ("return TransformCache[E] = 
>>>>>>>>>>>>> ExprError]).
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Same case here--that last line would only be reached if H and I
>>>>>>>>>>>> both failed within the call to BuildDeclarationNameExpr.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> We get back to Transform with a failure. So we walk the
>>>>>>>>>>>>> TypoExprs (in CheckAndAdvanceTypoExprCorrectionStreams) we reset 
>>>>>>>>>>>>> the 3rd
>>>>>>>>>>>>> consumer (because it was finished).
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> No, CheckAndAdvanceTypoExprCorrectionStreams would do nothing
>>>>>>>>>>>> if the first TypoExpr is not finished. If the first TypoExpr is 
>>>>>>>>>>>> finished,
>>>>>>>>>>>> then it would reset it and clear the cached result for the second 
>>>>>>>>>>>> TypoExpr
>>>>>>>>>>>> (so that the next call to TransformTypoExpr on the second TypoExpr 
>>>>>>>>>>>> would
>>>>>>>>>>>> try a new candidate instead of using the cached result). Only if 
>>>>>>>>>>>> both the
>>>>>>>>>>>> first and second TypoExprs were *both* finished would the cached 
>>>>>>>>>>>> value of
>>>>>>>>>>>> the third TypoExpr be cleared (and the candidate streams for both 
>>>>>>>>>>>> the first
>>>>>>>>>>>> and second TypoExprs would be reset in addition to their cached 
>>>>>>>>>>>> results
>>>>>>>>>>>> being removed from the cache). The only way
>>>>>>>>>>>> CheckAndAdvanceTypoExprCorrectionStreams would see if the third 
>>>>>>>>>>>> TypoExpr
>>>>>>>>>>>> was finished would be if both the first and second TypoExprs were 
>>>>>>>>>>>> also
>>>>>>>>>>>> finished, and upon resetting it the tree transform would only be 
>>>>>>>>>>>> attempted
>>>>>>>>>>>> again if there were at least four known TypoExprs (i.e. the number 
>>>>>>>>>>>> of known
>>>>>>>>>>>> TypoExprs is greater than the number of TypoExprs that were 
>>>>>>>>>>>> finished and
>>>>>>>>>>>> had to be reset by CheckAndAdvanceTypoExprCorrectionStreams).
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> We transform again:
>>>>>>>>>>>>>
>>>>>>>>>>>>> First typo) We are at A so we call next and get B
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes. The first TypoExpr would always have it's next candidate
>>>>>>>>>>>> tried by a call to TransformTypoExpr.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Second typo) From F we continue and try G, fail, then we're
>>>>>>>>>>>>> finished for the second typo.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> No, G would only be tried if the result from transforming F was
>>>>>>>>>>>> removed from the cache.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Third typo) Perhaps we find some solution for this, or not, it
>>>>>>>>>>>>> doesn't matter.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We get back to Transform with a failure (second typo), So we
>>>>>>>>>>>>> walk the TypoExprs again, resetting the second (and maybe the 
>>>>>>>>>>>>> third).
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> If there are only three TypoExprs, then resetting the third
>>>>>>>>>>>> TypoExpr would mean there are no valid combinations of corrections 
>>>>>>>>>>>> that
>>>>>>>>>>>> yield a valid expression (where the final result of the transform 
>>>>>>>>>>>> is
>>>>>>>>>>>> neither an invalid ExprResult nor caused any errors to be reported 
>>>>>>>>>>>> by the
>>>>>>>>>>>> SFINAETrap).
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> First typo) We are at B so we call next and get to C, let's
>>>>>>>>>>>>> say we fail here - C isn't even remotely valid.
>>>>>>>>>>>>> Now it really matters whether we find/fail on the second and
>>>>>>>>>>>>> third typo. If we fail on the second and third... we're going to 
>>>>>>>>>>>>> give up.
>>>>>>>>>>>>> We'll find that all the corrections are finished and we'll 
>>>>>>>>>>>>> conclude that
>>>>>>>>>>>>> we've tried everything.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Am I following this right?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Not quite. I think you are forgetting about the fact that, for
>>>>>>>>>>>> all TypoExprs except the first, a cached result is used if it 
>>>>>>>>>>>> exists
>>>>>>>>>>>> instead of fetching a new candidate from the Consumer and trying 
>>>>>>>>>>>> to build a
>>>>>>>>>>>> valid Expr with it.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Ah, looking at the code again I see what you mean. The 'break'
>>>>>>>>>>> is the bit that I missed in the reset/advance loop.
>>>>>>>>>>>
>>>>>>>>>>> Which patches would I have to apply to ToT to be able to
>>>>>>>>>>> exercise this code? It might help for me to step through some of 
>>>>>>>>>>> these
>>>>>>>>>>> scenarios in a debugger.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> To exercise this code, you'd need the full patch series (all 8)
>>>>>>>>>> to exercise this code because it's not until patch #8 that anything
>>>>>>>>>> actually generates TypoExprs in the AST. I can send you a combined 
>>>>>>>>>> patch
>>>>>>>>>> rebased on ToT off-list if you'd like (it's ~112KB uncompressed), 
>>>>>>>>>> and the
>>>>>>>>>> new test cases in test/SemaCXX/typo-correction-delayed.cpp 
>>>>>>>>>> explicitly test
>>>>>>>>>> this functionality.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That'd be great!
>>>>>>>>>
>>>>>>>>
>>>>>>>> With your total patch applied I've been able to reproduce a few
>>>>>>>> typo correction scenarios - thanks!
>>>>>>>>
>>>>>>>> But I can't seem to reach/exercise the loop in
>>>>>>>> TransformTypos::TransformTypoExpr - if I change the "if 
>>>>>>>> (!NE.isInvalid())"
>>>>>>>> condition at the end of the loop (the only way to repeat the body of 
>>>>>>>> the
>>>>>>>> loop) to "assert(!NE.isInvalid())" it doesn't fire on any of the test 
>>>>>>>> cases
>>>>>>>> (nor on my own modified versions). Do you have some tests that cover 
>>>>>>>> this?
>>>>>>>> Could you show/describe a simple situation where it occurs so I can 
>>>>>>>> play
>>>>>>>> around with that part of the logic?
>>>>>>>>
>>>>>>>
>>>>>>> I'm not sure it can be triggered with CorrectTypoDelayed only hooked
>>>>>>> up to LookupMemberExprInRecord (a static function called by
>>>>>>> Sema::BuildMemberReferenceExpr directly and indirectly via
>>>>>>> LookupMemberExpr), but in the second patch set which hooks
>>>>>>> CorrectTypoDelayed up to DiagnoseEmptyLookup, the assert is triggered by
>>>>>>> the tests SemaCXX/type-definition-in-specifier.cpp
>>>>>>> and SemaCXX/warn-thread-safety-parsing.cpp (in addition to the loop 
>>>>>>> being
>>>>>>> the correct/principled behavior since there is no guarantee that the
>>>>>>> recovery callback will generate a valid ExprResult from a non-empty
>>>>>>> TypoCorrection). For this patch set, to trigger the assert a code 
>>>>>>> snippet
>>>>>>> would have to correct a class member lookup and have a candidate that 
>>>>>>> is a
>>>>>>> member function, constant, variable, or template in the current class or
>>>>>>> one of its base classes (per the RecordMemberExprValidatorCCC) yet which
>>>>>>> yields causes MemberExprTypoRecovery to return an ExprError when trying 
>>>>>>> to
>>>>>>> build the member reference. Basically, it would require having the typo
>>>>>>> correction try a candidate that is an accessible class member that 
>>>>>>> cannot
>>>>>>> be reference--something that I'm not sure is even possible in C++.
>>>>>>>
>>>>>>
>>>>>> Hrm, OK - could you change it to an assertion (& remove the outer
>>>>>> loop/replace it with a conditional) until that later patch where it 
>>>>>> becomes
>>>>>> necessary, so it'll be easier to review it in the context of it being
>>>>>> executed/used code?
>>>>>>
>>>>>
>>>>> I understand your suggestion, however I don't think changing the code
>>>>> to do the wrong thing and assert if a false assumption is violated is the
>>>>> right thing to do. If the only source of an ExprResult in the loop was the
>>>>> call to Sema::BuildDeclarationNameExpr, I'd be a bit more willing to do
>>>>> so... but if a recovery handler callback is provided and invoked, there's
>>>>> no guarantee it won't return an invalid ExprResult, and no documentation 
>>>>> of
>>>>> the assumption that the callback must return a valid ExprResult (outside 
>>>>> of
>>>>> the one assertion you are suggesting, buried within the TypoExpr handling
>>>>> machinery). Additionally, changing the code to do the wrong thing by
>>>>> changing the "while (...)" to "if (...)" and replacing "if
>>>>> (!NE.isInvalid())" with a bogus assertion does nothing to simplify the
>>>>> actual code block... particularly since it would immediately be changed
>>>>> right back to the way it is now in the next patch set (a version of which 
>>>>> I
>>>>> had already mailed out two months ago).
>>>>>
>>>>
>>>> My concern is that this is untested (& actually untestable/dead in this
>>>> revision) code - if someone were to sync to this revision and try to
>>>> understand what the code did, why it was there, and how it should be
>>>> modified to do other things, they'd be confused & probably get it wrong
>>>> (because there are no tests to demonstrate the motivation). It's nice for
>>>> code to go along with the use - the usual rules of providing tests along
>>>> with code.
>>>>
>>>> But I'll muddle through & try to remember to figure out the tests & the
>>>> design when you commit/review the next patches if necessary.
>>>>
>>>>
>>>>>
>>>>> (FYI, I also just noticed I had a mistake in the full patch I sent
>>>>> you... specifically I seem to have forgotten to delete a return statement
>>>>> when I changed the code to look up the TypoExpr in the TransformCache only
>>>>> one and assign the result to the created entry and rebased the rest of the
>>>>> patch set:
>>>>>
>>>>> diff --git a/lib/Sema/SemaExprCXX.cpp b/lib/Sema/SemaExprCXX.cpp
>>>>> index 81b3b01..9e05e82 100644
>>>>> --- a/lib/Sema/SemaExprCXX.cpp
>>>>> +++ b/lib/Sema/SemaExprCXX.cpp
>>>>> @@ -6049,8 +6049,6 @@ public:
>>>>>          if (!TC.isKeyword())
>>>>>            R.addDecl(TC.getCorrectionDecl());
>>>>>          NE = SemaRef.BuildDeclarationNameExpr(CXXScopeSpec(), R,
>>>>> false);
>>>>> -        return TransformCache[E] =
>>>>> -                   SemaRef.BuildDeclarationNameExpr(CXXScopeSpec(),
>>>>> R, false);
>>>>>        }
>>>>>        assert(!NE.isUnset() &&
>>>>>               "Typo was transformed into a valid-but-null ExprResult");
>>>>>
>>>>> So the "if (!NE.isUnset())" is intended to be the only early return
>>>>> within the loop, though it still does not fail as an assert within the
>>>>> patchset.)
>>>>>
>>>>>>
>>>>>> Without that wrinkle, I think I've followed the code enough to sign
>>>>>> off on it. Please commit.
>>>>>>
>>>>>> [I do find the way the algorithm visits the corrections to be a
>>>>>> little hard to follow, in that its split between the TypoExprs cache,
>>>>>> consumer state, TransformTypoExpr, and
>>>>>> CheckAndAdvanceTypoExprCorrectionStreams - in some very abstract theory I
>>>>>> would like having each sequence of typo candidates (the Consumers 
>>>>>> represent
>>>>>> this) expose an iterator pair, then a generic utility to do combinations 
>>>>>> of
>>>>>> a list of ranges and it would expose an iterator over those combinations
>>>>>> which you could just increment to get the next one - it'd separate out 
>>>>>> the
>>>>>> combination logic into a reusable component that could be easily
>>>>>> examined/tested, then composed with the typo correction stuff to keep the
>>>>>> logic there relatively simple/clearer.
>>>>>>
>>>>>> I might try to work up an example of what I mean]
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> - David
>>>>>>>>
>>>>>>>> (micro review comment I saw along the way: I /think/ 
>>>>>>>> CheckAndAdvanceTypoExprCorrectionStreams
>>>>>>>> could be simplified as follows:
>>>>>>>>
>>>>>>>
>>>>>>> You're right, it can be simplified now that the loop is in its own
>>>>>>> function. Changed.
>>>>>>>
>>>>>>>>
>>>>>>>>   bool CheckAndAdvanceTypoExprCorrectionStreams() {
>>>>>>>>     for (auto TE : TypoExprs) {
>>>>>>>>       TransformCache.erase(TE);
>>>>>>>>       auto &State = SemaRef.getTypoExprState(TE);
>>>>>>>>       if (!State.Consumer->finished())
>>>>>>>>         return true;
>>>>>>>>       State.Consumer->resetCorrectionStream();
>>>>>>>>     }
>>>>>>>>     return false;
>>>>>>>>   }
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> (I'm sort of wondering if this might be better expressed by some
>>>>>>>>>>> iterator scheme, but I'm not sure - there's no pretty 
>>>>>>>>>>> next_combination
>>>>>>>>>>> algorithm (someone proposed one to boost at some point, it seems, 
>>>>>>>>>>> but
>>>>>>>>>>> that's about as close as it's got to anything common/generic) 
>>>>>>>>>>> anyway)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> So for 3 TypoExprs... let's say 1 has the candidates (A, B,
>>>>>>>>>>>>>> C), 2 has the candidates (D, E, F, G) and 3 has the candidates 
>>>>>>>>>>>>>> (H, I) the
>>>>>>>>>>>>>> iterations would test the combinations in order:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> A D H
>>>>>>>>>>>>>> B D H
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I suppose a simpler form of my above example is: if, on the
>>>>>>>>>>>>> first pass, we get ADH, on the next pass the code as-is would 
>>>>>>>>>>>>> result in
>>>>>>>>>>>>> BCI, wouldn't it? It'll "next" each of the corrections.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> (I think you meant BEI, and C is only a candidate of the first
>>>>>>>>>>>> TypoExpr, not the second one.) No it won't because of the caching. 
>>>>>>>>>>>> The
>>>>>>>>>>>> second TypoExpr is only effectively "next"ed once all of the 
>>>>>>>>>>>> candidates for
>>>>>>>>>>>> the first TypoExpr have been tried with the current candidate of 
>>>>>>>>>>>> the second
>>>>>>>>>>>> TypoExpr, and the third TypoExpr would only be "next"ed if all of 
>>>>>>>>>>>> the
>>>>>>>>>>>> candidates of both the first and second TypoExpr have been tried 
>>>>>>>>>>>> with the
>>>>>>>>>>>> current candidate of the third TypoExpr.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> And I'm still not quite understanding why there's the  E !=
>>>>>>>>>>>>> TypoExprs[0] condition... maybe that's part of what I'm missing. 
>>>>>>>>>>>>> It looks
>>>>>>>>>>>>> like that code would cause, in a single Transform, multiple 
>>>>>>>>>>>>> instances of
>>>>>>>>>>>>> the first typo to be computed with different/subsequent typo 
>>>>>>>>>>>>> results...
>>>>>>>>>>>>> which seems confusing/strange. (though I don't fully understand 
>>>>>>>>>>>>> how/where
>>>>>>>>>>>>> multiple instances of the same TypoExpr will be visited in a 
>>>>>>>>>>>>> single
>>>>>>>>>>>>> Transform)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> That condition ensures that the cached result is never used for
>>>>>>>>>>>> the first TypoExpr (instead of having an extra conditional for all 
>>>>>>>>>>>> of the
>>>>>>>>>>>> return statements below here to not cache the result for the first
>>>>>>>>>>>> TypoExpr).
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> (some nitpicks of the code - there's a duplicate lookup of
>>>>>>>>>>>>> TransformCache (count then [], or count then insert further down 
>>>>>>>>>>>>> the
>>>>>>>>>>>>> algorithm - it should probably just be:
>>>>>>>>>>>>>
>>>>>>>>>>>>> ExprResult &CacheEntry = TransformCache[E];
>>>>>>>>>>>>> if (CacheEntry)
>>>>>>>>>>>>>   return CacheEntry;
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This wouldn't work because "TransformCache[E]" would add a
>>>>>>>>>>>> cache entry for E, and either the "if (CacheEntry)" would always 
>>>>>>>>>>>> be true,
>>>>>>>>>>>> or an ExprError() would be considered false and an ExprError cache 
>>>>>>>>>>>> entry
>>>>>>>>>>>> would always be ignored, both of which would be bad.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> you're right that ExprResult isn't boolean testable - but it is
>>>>>>>>>>> possible to distinguish ExprError from a default-constructed 
>>>>>>>>>>> ExprResult. A
>>>>>>>>>>> default constructed ExprResult is in the valid-but-unset state, so 
>>>>>>>>>>> far as I
>>>>>>>>>>> can tell, so it would look like:
>>>>>>>>>>>
>>>>>>>>>>> if (!CacheEntry.isUnset())
>>>>>>>>>>>   return CacheEntry
>>>>>>>>>>>
>>>>>>>>>>> (which is, admittedly, not the /most/ legible thing ever)
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I'm worried this won't work, or at least would be (potentially)
>>>>>>>>>> fragile, because .isUnset() is defined as "valid-but-null" and Sema 
>>>>>>>>>> sets
>>>>>>>>>> valid-but-null as an explicit state in the return values of various
>>>>>>>>>> functions--at least a few of which I saw while looking at users of 
>>>>>>>>>> the typo
>>>>>>>>>> correction code.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Ooh... I could totally see that happening (& thanks for the
>>>>>>>>> explanation).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Of course, a new local variable could be created containing the
>>>>>>>>>>>> result of TransformCache.find(E) and then compared to 
>>>>>>>>>>>> TransformCache.end()
>>>>>>>>>>>> and returning as appropriate, but I find it to be more verbose and 
>>>>>>>>>>>> less
>>>>>>>>>>>> clear:
>>>>>>>>>>>>
>>>>>>>>>>>> auto CacheEntry = TransformCache.find(E);
>>>>>>>>>>>> if (!TypoExprs.insert(E) && CacheEntry != TransformCache.end())
>>>>>>>>>>>>   return CacheEntry->second;
>>>>>>>>>>>>
>>>>>>>>>>>> And neither of the subsequent return statements would be
>>>>>>>>>>>> changed because the cache entry would still need to be created...
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> If you were using the iterator version, they could be hinted
>>>>>>>>>>> insertions which might be better - but I agree, if you're just 
>>>>>>>>>>> using the
>>>>>>>>>>> iterators it's probably not terribly helpful. But as shown above, 
>>>>>>>>>>> you can
>>>>>>>>>>> actually just use the op[] at the start, check !isUnset, then 
>>>>>>>>>>> initialize
>>>>>>>>>>> the elements. This is a fairly common idiom in map related code, 
>>>>>>>>>>> though
>>>>>>>>>>> more often it looks like
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> After a bit more consideration, I've taken your suggested
>>>>>>>>>> approach of using .isUnset() and added an assertion on trying to 
>>>>>>>>>> cache an
>>>>>>>>>> "unset" value ("assert(!NE.isUnset() && ...);").
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sounds like a fair tradeoff - if the assertion does fire we can
>>>>>>>>> add a test case for it & switch back to something like your original
>>>>>>>>> formulation.
>>>>>>>>>
>>>>>>>>> The other gotcha to be aware of is that since it's a reference
>>>>>>>>> into a DenseMap are invalidated by insertions - so if any of the work 
>>>>>>>>> to
>>>>>>>>> initialize the TransformCache (the while nextCorrection loop, and 
>>>>>>>>> probably
>>>>>>>>> most importantly the "BuildDeclarationNameExpr" call) can also modify 
>>>>>>>>> the
>>>>>>>>> TransformCache (by transforming another typo expr) then the reference 
>>>>>>>>> will
>>>>>>>>> be invalidated and the approach I'd suggested won't be applicable (you
>>>>>>>>> could still keep just one lookup in the cache hit case, but in the 
>>>>>>>>> cache
>>>>>>>>> miss you'd have to do a second lookup after the work)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ...
>>>>>>>>>>>>>     return CacheEntry = NE;
>>>>>>>>>>>>> }
>>>>>>>>>>>>> return CacheEntry = ExprError();
>>>>>>>>>>>>>
>>>>>>>>>>>>> )
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> C D H
>>>>>>>>>>>>>> A E H
>>>>>>>>>>>>>> B E H
>>>>>>>>>>>>>> C E H
>>>>>>>>>>>>>> A F H
>>>>>>>>>>>>>> B F H
>>>>>>>>>>>>>> C F H
>>>>>>>>>>>>>> A G H
>>>>>>>>>>>>>> B G H
>>>>>>>>>>>>>> C G H
>>>>>>>>>>>>>> A D I
>>>>>>>>>>>>>> B D I
>>>>>>>>>>>>>> C D I
>>>>>>>>>>>>>> A E I
>>>>>>>>>>>>>> B E I
>>>>>>>>>>>>>> C E I
>>>>>>>>>>>>>> A F I
>>>>>>>>>>>>>> B F I
>>>>>>>>>>>>>> C F I
>>>>>>>>>>>>>> A G I
>>>>>>>>>>>>>> B G I
>>>>>>>>>>>>>> C G I
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we get to the end, have invalid code, and the first typo has 
>>>>>>>>>>>>>>> no alternatives left to try - I guess we just want to print 
>>>>>>>>>>>>>>> errors for everything, without typo corrections?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If all of the combinations of typo corrections failed to
>>>>>>>>>>>>>> create an acceptable Expr, then the current correction for all 
>>>>>>>>>>>>>> of TypoExprs
>>>>>>>>>>>>>> would be an empty TypoCorrection which would cause the 
>>>>>>>>>>>>>> corresponding
>>>>>>>>>>>>>> diagnostic handlers to print out errors without suggestions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It might be helpful to split out some of these chunks (the 
>>>>>>>>>>>>>>> state resetting code and maybe the diagnostic printing at the 
>>>>>>>>>>>>>>> end) into separate functions to give them logical names & make 
>>>>>>>>>>>>>>> the higher level algorithm more clear?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've split the diagnostics emission and the state handling
>>>>>>>>>>>>>> into separate helpers, and expanded their description comments. 
>>>>>>>>>>>>>> I'm not
>>>>>>>>>>>>>> keen on the name for the state handler but it is the best name I 
>>>>>>>>>>>>>> could come
>>>>>>>>>>>>>> up with that was even remotely descriptive.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Roughly works, though I'm still quite confused, it seems.
>>>>>>>>>>>>> (alternative name for that function would be "reset" rather than 
>>>>>>>>>>>>> "advance"
>>>>>>>>>>>>> - but once I understand the algorithm better I might have better
>>>>>>>>>>>>> suggestions)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The function removes the cached result for the first TypoExpr
>>>>>>>>>>>> then checks the TypoExpr's correction candidate stream; if it has 
>>>>>>>>>>>> reached
>>>>>>>>>>>> the end of the stream, it will reset the stream then perform the 
>>>>>>>>>>>> same check
>>>>>>>>>>>> for the second TypoExpr, continuing until it either has reset the
>>>>>>>>>>>> correction streams for all of the TypoExprs or has found a 
>>>>>>>>>>>> TypoExpr that
>>>>>>>>>>>> hadn't reached the end of its correction stream. And in the former 
>>>>>>>>>>>> case, it
>>>>>>>>>>>> would signal that all combinations of corrections had been retried 
>>>>>>>>>>>> and the
>>>>>>>>>>>> TreeTransform should stop searching for a valid combination of 
>>>>>>>>>>>> corrections.
>>>>>>>>>>>> And now that I think about it, the "E != TypoExprs[0]" is probably 
>>>>>>>>>>>> now a
>>>>>>>>>>>> superfluous check since calling 
>>>>>>>>>>>> CheckAndAdvanceTypoExprCorrectionStreams
>>>>>>>>>>>> would always remove the first TypoExpr from the result cache.
>>>>>>>>>>>>
>>>>>>>>>>>> The loop at the end ("for (auto E : TypoExprs) {") is over a 
>>>>>>>>>>>> hashing data structure, right? Is that going to cause unstable 
>>>>>>>>>>>> output of diagnostics? (eg: different machines, different memory 
>>>>>>>>>>>> layout, the same program might get the typos printed in a 
>>>>>>>>>>>> different order) or is there something else that will fix that?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you have to change the set to be a SetVector, or similar (to 
>>>>>>>>>>>>>>> stabilize output) - then you won't need the FirstTypoExpr 
>>>>>>>>>>>>>>> anymore - it'll just be the first one in the set, since you 
>>>>>>>>>>>>>>> iterate in insertion order at that point.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Switched it from a SmallPtrSet to a SmallSetVector like it
>>>>>>>>>>>>>> should have been. Doing so also allowed me to simplify the state 
>>>>>>>>>>>>>> handling
>>>>>>>>>>>>>> code by being able to assume a deterministic order, and 
>>>>>>>>>>>>>> eliminating
>>>>>>>>>>>>>> FirstTypoExpr.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> "while (TypoCorrection TC = 
>>>>>>>>>>>>>>> State.Consumer->getNextCorrection()) {" doesn't seem to be a 
>>>>>>>>>>>>>>> loop - it has an unconditional return at the end of the body. 
>>>>>>>>>>>>>>> Should it just be an 'if'?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There was actually a missing check on whether the result of
>>>>>>>>>>>>>> Sema::BuildDeclarationNameExpr was valid.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Were there missing/added test cases to cover this situation?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> No because at this point in the patch series, there is nothing
>>>>>>>>>>>> that actually generates TypoExprs.
>>>>>>>>>>>>
>>>>>>>>>>>>> "+ (FullExpr.get()->isTypeDependent() ||
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +       FullExpr.get()->isValueDependent() ||
>>>>>>>>>>>>>>> +       FullExpr.get()->isInstantiationDependent())) {"
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What are those checks for? I don't know what condition we would 
>>>>>>>>>>>>>>> be in if we had typo corrections but it wasn't type, value, or 
>>>>>>>>>>>>>>> instantiation dependent. Perhaps a comment could explain that?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The check is to avoid running the tree transform on Expr
>>>>>>>>>>>>>> trees that are known to not have a TypoExpr in them, regardless 
>>>>>>>>>>>>>> of whether
>>>>>>>>>>>>>> the current expression evaluation context still indicates there 
>>>>>>>>>>>>>> are
>>>>>>>>>>>>>> uncorrected typos. I've added a comment to that effect.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hmm - I suppose my question might then be: When would we have
>>>>>>>>>>>>> NumTypos > 0 but be in a situation where the full expression 
>>>>>>>>>>>>> cannot have
>>>>>>>>>>>>> any typos in it?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I don't remember off the top of my head because it's been a few
>>>>>>>>>>>> months now since I wrote the original patches... but basically any
>>>>>>>>>>>> situation where more than one FullExpr shares an expression 
>>>>>>>>>>>> evaluation
>>>>>>>>>>>> context or when the current expression evaluation context had its 
>>>>>>>>>>>> NumTypos
>>>>>>>>>>>> increased by an expression evaluation context that was popped off 
>>>>>>>>>>>> the
>>>>>>>>>>>> stack. And TreeTransforms aren't cheap enough to blindly run on 
>>>>>>>>>>>> every Expr
>>>>>>>>>>>> when NumTypos is non-zero.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> To simplify Sema::clearDelayedTypo you could add 
>>>>>>>>>>>>>> MapVector::erase(const KeyType&) (given that std::map has such 
>>>>>>>>>>>>>> an operation, it seems like a reasonable one for MapVector to 
>>>>>>>>>>>>>> have).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Done (simple patch attached).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Great - if you'd like to commit this separately, it should
>>>>>>>>>>>>> have a unit test. Otherwise it's probably simple enough to commit 
>>>>>>>>>>>>> along
>>>>>>>>>>>>> with your new usage if you prefer.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Committed with unit test as r219240.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sema::getTypoExprState looks like it could be a const function
>>>>>>>>>>>>>>> (it looks like it's never meant to be called when the element 
>>>>>>>>>>>>>>> isn't already
>>>>>>>>>>>>>>> in the map) - using 'lookup' instead of operator[] will be a 
>>>>>>>>>>>>>>> const
>>>>>>>>>>>>>>> operation on the map (& assert if the element doesn't exist in 
>>>>>>>>>>>>>>> the map).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sadly I cannot do that because the operator[] cannot be used
>>>>>>>>>>>>>> if the method is made const, and .lookup() can't be used because 
>>>>>>>>>>>>>> it returns
>>>>>>>>>>>>>> by-value instead of by-reference and TypoExprState is not 
>>>>>>>>>>>>>> copyable:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ../tools/clang/lib/Sema/SemaLookup.cpp:4600:10: warning:
>>>>>>>>>>>>>> returning reference to local temporary object 
>>>>>>>>>>>>>> [-Wreturn-stack-address]
>>>>>>>>>>>>>>   return DelayedTypos.lookup(TE);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ../include/llvm/ADT/MapVector.h:88:30: error: call to
>>>>>>>>>>>>>> implicitly-deleted copy constructor of 'const 
>>>>>>>>>>>>>> clang::Sema::TypoExprState'
>>>>>>>>>>>>>>     return Pos == Map.end()? ValueT() :
>>>>>>>>>>>>>> Vector[Pos->second].second;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Oh, weird - my mistake in assuming/misremembering what lookup
>>>>>>>>>>>>> actually does. I'd still make getTypoExprState const and use 
>>>>>>>>>>>>> find+assert
>>>>>>>>>>>>> just to constrain the interface a bit more rather than leaving it 
>>>>>>>>>>>>> possible
>>>>>>>>>>>>> to accidentally create a new entry in the map when the API is 
>>>>>>>>>>>>> only intended
>>>>>>>>>>>>> to query existing entries.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Done. The new changes I've made overall are:
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/include/clang/Sema/Sema.h
>>>>>>>>>>>> b/include/clang/Sema/Sema.h
>>>>>>>>>>>> index b30bf5e..f644fdd 100644
>>>>>>>>>>>> --- a/include/clang/Sema/Sema.h
>>>>>>>>>>>> +++ b/include/clang/Sema/Sema.h
>>>>>>>>>>>> @@ -2628,7 +2628,7 @@ private:
>>>>>>>>>>>>                               bool ErrorRecovery, bool
>>>>>>>>>>>> &IsUnqualifiedLookup);
>>>>>>>>>>>>
>>>>>>>>>>>>  public:
>>>>>>>>>>>> -  const TypoExprState &getTypoExprState(TypoExpr *TE);
>>>>>>>>>>>> +  const TypoExprState &getTypoExprState(TypoExpr *TE) const;
>>>>>>>>>>>>
>>>>>>>>>>>>    /// \brief Clears the state of the given TypoExpr.
>>>>>>>>>>>>    void clearDelayedTypo(TypoExpr *TE);
>>>>>>>>>>>> diff --git a/lib/Sema/SemaExprCXX.cpp b/lib/Sema/SemaExprCXX.cpp
>>>>>>>>>>>> index 73aa997..1f08276 100644
>>>>>>>>>>>> --- a/lib/Sema/SemaExprCXX.cpp
>>>>>>>>>>>> +++ b/lib/Sema/SemaExprCXX.cpp
>>>>>>>>>>>> @@ -6028,7 +6028,7 @@ public:
>>>>>>>>>>>>      // If the TypoExpr hasn't been seen before, record it.
>>>>>>>>>>>> Otherwise, return the
>>>>>>>>>>>>      // cached transformation result if there is one and the
>>>>>>>>>>>> TypoExpr isn't the
>>>>>>>>>>>>      // first one that was encountered.
>>>>>>>>>>>> -    if (!TypoExprs.insert(E) && E != TypoExprs[0] &&
>>>>>>>>>>>> TransformCache.count(E)) {
>>>>>>>>>>>> +    if (!TypoExprs.insert(E) && TransformCache.count(E)) {
>>>>>>>>>>>>        return TransformCache[E];
>>>>>>>>>>>>      }
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/lib/Sema/SemaLookup.cpp b/lib/Sema/SemaLookup.cpp
>>>>>>>>>>>> index 8ddb5fd..bb6c76a 100644
>>>>>>>>>>>> --- a/lib/Sema/SemaLookup.cpp
>>>>>>>>>>>> +++ b/lib/Sema/SemaLookup.cpp
>>>>>>>>>>>> @@ -4601,8 +4601,11 @@ TypoExpr
>>>>>>>>>>>> *Sema::createDelayedTypo(std::unique_ptr<TypoCorrectionConsumer> 
>>>>>>>>>>>> TCC,
>>>>>>>>>>>>    return TE;
>>>>>>>>>>>>  }
>>>>>>>>>>>>
>>>>>>>>>>>> -const Sema::TypoExprState &Sema::getTypoExprState(TypoExpr
>>>>>>>>>>>> *TE) {
>>>>>>>>>>>> -  return DelayedTypos[TE];
>>>>>>>>>>>> +const Sema::TypoExprState &Sema::getTypoExprState(TypoExpr
>>>>>>>>>>>> *TE) const {
>>>>>>>>>>>> +  auto Entry = DelayedTypos.find(TE);
>>>>>>>>>>>> +  assert(Entry != DelayedTypos.end() &&
>>>>>>>>>>>> +         "Failed to get the state for a TypoExpr!");
>>>>>>>>>>>> +  return Entry->second;
>>>>>>>>>>>>  }
>>>>>>>>>>>>
>>>>>>>>>>>>  void Sema::clearDelayedTypo(TypoExpr *TE) {
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Otherwise this all looks like rather neat stuff - though I'm
>>>>>>>>>>>>>>> not as able to review in detail some of the refactoring of 
>>>>>>>>>>>>>>> existing typo
>>>>>>>>>>>>>>> correction stuff. It looks plausible.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've attached the updated patch. Thanks for reviewing this,
>>>>>>>>>>>>>> David!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Sep 25, 2014 at 2:00 PM, Kaelyn Takata <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ping
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Aug 26, 2014 at 11:05 AM, Kaelyn Takata <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +dblaikie
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jul 31, 2014 at 1:20 PM, Kaelyn Takata <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Part of the infrastructure is a map from a TypoExpr to
>>>>>>>>>>>>>>>>>> the Sema-specific
>>>>>>>>>>>>>>>>>> state needed to correct it, along with helpers to ease
>>>>>>>>>>>>>>>>>> dealing with the
>>>>>>>>>>>>>>>>>> state.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The the typo count is propagated up the stack of
>>>>>>>>>>>>>>>>>> ExpressionEvaluationContextRecords when one is popped off
>>>>>>>>>>>>>>>>>> of to
>>>>>>>>>>>>>>>>>> avoid accidentally dropping TypoExprs on the floor. For
>>>>>>>>>>>>>>>>>> example,
>>>>>>>>>>>>>>>>>> the attempted correction of g() in
>>>>>>>>>>>>>>>>>> test/CXX/class/class.mem/p5-0x.cpp
>>>>>>>>>>>>>>>>>> happens with an ExpressionEvaluationContextRecord that is
>>>>>>>>>>>>>>>>>> popped off
>>>>>>>>>>>>>>>>>> the stack prior to ActOnFinishFullExpr being called and
>>>>>>>>>>>>>>>>>> the tree
>>>>>>>>>>>>>>>>>> transform for TypoExprs being run.
>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>  include/clang/Sema/Sema.h           |  44 +++++
>>>>>>>>>>>>>>>>>>  include/clang/Sema/SemaInternal.h   |  15 +-
>>>>>>>>>>>>>>>>>>  include/clang/Sema/TypoCorrection.h |   2 +-
>>>>>>>>>>>>>>>>>>  lib/Sema/SemaExpr.cpp               |   7 +
>>>>>>>>>>>>>>>>>>  lib/Sema/SemaExprCXX.cpp            | 108 ++++++++++++
>>>>>>>>>>>>>>>>>>  lib/Sema/SemaLookup.cpp             | 316
>>>>>>>>>>>>>>>>>> ++++++++++++++++++++++++------------
>>>>>>>>>>>>>>>>>>  6 files changed, 384 insertions(+), 108 deletions(-)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [PATCH v4 5/8] Start adding the infrastructure for handling TypoExprs.

Reply via email to