Re: [PATCH v4 5/8] Start adding the infrastructure for handling TypoExprs.

David Blaikie Mon, 27 Oct 2014 11:30:07 -0700

On Mon, Oct 27, 2014 at 11:12 AM, Kaelyn Takata <[email protected]> wrote:


>
>
> On Mon, Oct 27, 2014 at 10:39 AM, David Blaikie <[email protected]>
> wrote:
>
>> Looks good, please commit whenever you're ready.
>>
>
> Awesome, thanks! And thank you for taking the time to review this patch
> set! :)
>

Thanks for your patience!


>
>> I think it might be nice to explore modelling the Consumer as a range
>> exposing iterators, etc rather than having internal state
>> (getNextCorrection+finished+resetCorrectionStream), leveraging existing C++
>> idioms like this may make the code a bit easier to understand at first
>> glance (maybe, anyway - it'd also provide the usual benefits of the
>> iterator pattern (separation of concerns between the sequence and its
>> iteration, possibility of multiple active iterations, etc) but I don't know
>> that any of those would be particularly important/useful in this particular
>> situation at the moment)
>>
>
> I agree that the modelling of the set/stream of correction candidates
> obtainable from the Consumer could possibly be done better, and I have some
> thoughts in mind on how to improve and simplify the interface a bit.
> However, I want to get it working and the delayed typo correction wired up
> in more places before trying to simplify the Consumer's
> correction-retrieval interface so that as much of the needed functionality
> as possible will be known when doing the refactoring
>

Yep, totally!


> (i.e. the next patch set makes several hefty changes to TransformTypoExpr
> to allow it to correctly handle new situations that were exposed by hooking
> the delayed typo correction to DiagnoseEmptyLookup).
>
>
>> On Mon, Oct 27, 2014 at 9:57 AM, Kaelyn Takata <[email protected]> wrote:
>>
>>> Here's the latest version of just this patch. The new test case is in
>>> patch 8/8.
>>>
>>> On Thu, Oct 23, 2014 at 11:23 AM, David Blaikie <[email protected]>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Oct 23, 2014 at 11:14 AM, Kaelyn Takata <[email protected]>
>>>> wrote:
>>>>
>>>>> Well, I've found a test case that would trigger the assertion but in
>>>>> the code's current form does the right thing:
>>>>>
>>>>
>>>> Great!
>>>>
>>>>
>>>>>
>>>>> class Foo {
>>>>> public:
>>>>>   template <> void testIt();
>>>>>   void textIt();
>>>>> };
>>>>>
>>>>> void test(Foo *f) {
>>>>>   f->TestIt();
>>>>> }
>>>>>
>>>>>
>>>>> With the patchset in it's current incarnation (including the fix I
>>>>> mentioned in my last email):
>>>>>
>>>>
>>>> I didn't quite follow the description of the other fix/issue there -
>>>> few too many patches in my head. Could you send the latest version of this
>>>> patch under review?
>>>>
>>>>
>>>>>
>>>>> fail.cc:3:20: error: no function template matches function template
>>>>> specialization 'testIt'
>>>>>   template <> void testIt();
>>>>>                    ^
>>>>> fail.cc:8:6: error: no member named 'TestIt' in 'Foo'; did you mean
>>>>> 'textIt'?
>>>>>   f->TestIt();
>>>>>      ^~~~~~
>>>>>      textIt
>>>>> fail.cc:4:8: note: 'textIt' declared here
>>>>>   void textIt();
>>>>>        ^
>>>>> 2 errors generated.
>>>>>
>>>>>
>>>>> With the "if (!NE.isInvalid())" as an assertion:
>>>>>
>>>>> fail.cc:3:20: error: no function template matches function template
>>>>> specialization 'testIt'
>>>>>   template <> void testIt();
>>>>>                    ^
>>>>> clang-3.6: ../tools/clang/lib/Sema/SemaExprCXX.cpp:6055: ExprResult
>>>>> (anonymous namespace)::TransformTypos::TransformTypoExpr(clang::TypoExpr
>>>>> *): Assertion `!NE.isInvalid()' failed.
>>>>>
>>>>>
>>>>> Note that the (invalid) function template "testIt" has an edit
>>>>> distance of 1 from "TestIt" and the proper suggestion, "textIt", has an
>>>>> edit distance of 2. I've added the above test case to
>>>>> typo-correction-delayed.cpp.
>>>>>
>>>>>
>>>>> On Thu, Oct 23, 2014 at 11:05 AM, David Blaikie <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> (sorry for the repetition - I'm failing at 'reply all' this morning)
>>>>>>
>>>>>> On Thu, Oct 23, 2014 at 10:53 AM, Kaelyn Takata <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> On Wed, Oct 22, 2014 at 2:19 PM, David Blaikie <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Oct 21, 2014 at 1:37 PM, Kaelyn Takata <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Oct 17, 2014 at 9:52 AM, David Blaikie <[email protected]
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Oct 9, 2014 at 10:12 AM, David Blaikie <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Oct 9, 2014 at 9:58 AM, Kaelyn Takata <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Oct 7, 2014 at 5:03 PM, David Blaikie <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Oct 7, 2014 at 2:29 PM, Kaelyn Takata <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Oct 7, 2014 at 11:43 AM, David Blaikie <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Oct 6, 2014 at 3:46 PM, Kaelyn Takata <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Oct 3, 2014 at 11:45 AM, David Blaikie <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Given that these are parts of a patch series, some of
>>>>>>>>>>>>>>>>> which have already been signed off on/I haven't looked at, 
>>>>>>>>>>>>>>>>> it's a bit hard
>>>>>>>>>>>>>>>>> to review the overall design - but I believe you & Richard 
>>>>>>>>>>>>>>>>> talked that
>>>>>>>>>>>>>>>>> over, so I'm not too worried about that (just explaining why 
>>>>>>>>>>>>>>>>> my review
>>>>>>>>>>>>>>>>> feedback's going to be a bit narrow)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +  if (NumTypos > 0 && !ExprEvalContexts.empty())
>>>>>>>>>>>>>>>>> +    ExprEvalContexts.back().NumTypos += NumTypos;
>>>>>>>>>>>>>>>>> +  else
>>>>>>>>>>>>>>>>> +    assert(NumTypos == 0 && "There are outstanding typos 
>>>>>>>>>>>>>>>>> after popping the "
>>>>>>>>>>>>>>>>> +                            "last 
>>>>>>>>>>>>>>>>> ExpressionEvaluationContextRecord");
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Could this be simplified? I think it's equivalent to:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   if (!ExprEvalContexts.empty())
>>>>>>>>>>>>>>>>>     ExprEvalContexts.back().NumTypos += NumTypos;
>>>>>>>>>>>>>>>>>   else
>>>>>>>>>>>>>>>>>     assert(NumTypos == 0 ... );
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> or:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   assert(ExprEvalContexts.empty() || NumTypos != 0 && ...)
>>>>>>>>>>>>>>>>>   if (!ExprEvalContexts.empty())
>>>>>>>>>>>>>>>>>     ...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yes, the "NumTypos > 0" was unnecessary and has been
>>>>>>>>>>>>>>>> removed. (FYI, the second form you suggested doesn't quite 
>>>>>>>>>>>>>>>> work because the
>>>>>>>>>>>>>>>> assert would fail on the common case where ExprEvalContexts is 
>>>>>>>>>>>>>>>> not empty
>>>>>>>>>>>>>>>> and there are no typos.)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hmm, right, maybe:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   assert(ExprEvalContexts.empty() ^ NumTypos != 0 && ... )
>>>>>>>>>>>>>>>   if (!ExprEvalContexts.empty())
>>>>>>>>>>>>>>>     ExprEvalContexts.back().NumTypos += NumTypos;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> but perhaps that's just too confusing. I just find an
>>>>>>>>>>>>>>> assertion as the only thing inside a block of code to be a bit 
>>>>>>>>>>>>>>> off... but
>>>>>>>>>>>>>>> it's certainly not unprecedented in LLVM and is probably more 
>>>>>>>>>>>>>>> legible -
>>>>>>>>>>>>>>> just my own hangup to get over :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There's an unnecessary semicolon at the end of the "while 
>>>>>>>>>>>>>>>> (true)" loop in TransformTypos::Transform
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Removed.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'd roll this up a bit:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +      auto &State = SemaRef.getTypoExprState(TE);
>>>>>>>>>>>>>>>>> +      if (State.DiagHandler)
>>>>>>>>>>>>>>>>> +        
>>>>>>>>>>>>>>>>> (*State.DiagHandler)(State.Consumer->getCurrentCorrection());
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> into:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   if (auto *Handler = 
>>>>>>>>>>>>>>>>> SemaRef.getTypoExprState(TE).DiagHandler)
>>>>>>>>>>>>>>>>>     (*DiagHandler)(State.Consumer->getCurrentCorrection());
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> That rollup doesn't work because State would be unknown
>>>>>>>>>>>>>>>> when accessing State.Consumer in the argument to the 
>>>>>>>>>>>>>>>> DiagHandler.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Oh, right!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> count in the condition, plus insert in the body of the if in 
>>>>>>>>>>>>>>>> TransformTypos::TransformTypoExpr would be more efficient as a 
>>>>>>>>>>>>>>>> single insert-and-check (by avoiding two separate lookups into 
>>>>>>>>>>>>>>>> the set):
>>>>>>>>>>>>>>>>>   if (TypoExprs.insert(E).second) // means this element 
>>>>>>>>>>>>>>>>> previously didn't exist in the set, and now does
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I've changed the count-then-insert to just an insert since
>>>>>>>>>>>>>>>> SmallPtrSetImpl::insert returns true when the item wasn't 
>>>>>>>>>>>>>>>> already in the
>>>>>>>>>>>>>>>> set.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   TransformCache.find(E) != TransformCache.end()
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> might commonly be written as:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   TransformCache.count(E)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Optionally with "!= 0", but often count is used directly as a 
>>>>>>>>>>>>>>>>> boolean test, especially when it's a single set/map (not a 
>>>>>>>>>>>>>>>>> multi set/map).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Changed.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This comment didn't quite make sense to me:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +      // If corrections for the first TypoExpr have been 
>>>>>>>>>>>>>>>>> exhausted, retry them
>>>>>>>>>>>>>>>>> +      // against the next combination of substitutions for 
>>>>>>>>>>>>>>>>> all of the other
>>>>>>>>>>>>>>>>> +      // TypoExprs.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> When you say "retry them" which them do you mean? The 
>>>>>>>>>>>>>>>>> corrections for the first TypoExpr? But what does it mean for 
>>>>>>>>>>>>>>>>> them to be exhausted then, if they can still be considered?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Maybe I need a bit of help with the higher level algorithm 
>>>>>>>>>>>>>>>>> here - I assume the idea is:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Call transform.
>>>>>>>>>>>>>>>>> As it visits TypoExpr nodes, it tries their first typo 
>>>>>>>>>>>>>>>>> candidate available.
>>>>>>>>>>>>>>>>> If we get to the end and have invalid code, reset the state 
>>>>>>>>>>>>>>>>> of all the typo corrections except the first one (on its next 
>>>>>>>>>>>>>>>>> query, it'll give back its second alternative).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Here is where the confusion is at. As the transform visits
>>>>>>>>>>>>>>>> each TypoExpr node it tries the first candidate available. If 
>>>>>>>>>>>>>>>> the first
>>>>>>>>>>>>>>>> candidates of all the TypoExprs don't work, it then tries the 
>>>>>>>>>>>>>>>> next (and
>>>>>>>>>>>>>>>> subsequent) candidates of only the first TypoExpr that was 
>>>>>>>>>>>>>>>> seen. When it
>>>>>>>>>>>>>>>> exhausts the candidates from the first TypoExpr, it then 
>>>>>>>>>>>>>>>> resets the stream
>>>>>>>>>>>>>>>> of candidates for the first TypoExpr and fetches the next 
>>>>>>>>>>>>>>>> candidate of the
>>>>>>>>>>>>>>>> TypoExpr, continuing in this way until all combinations of 
>>>>>>>>>>>>>>>> candidates of
>>>>>>>>>>>>>>>> all TypoExprs have been tried.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So I'm still a bit confused about which parts of the retry
>>>>>>>>>>>>>>> logic are in which parts of the code.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> OK, I think I've got it. Transform is called and we don't
>>>>>>>>>>>>>>> even know which TypoExprs are in the Expr. So we transform.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As transform runs it comes across TypoExprs, and attempts to
>>>>>>>>>>>>>>> transform them. For each one, it tries candidates until it 
>>>>>>>>>>>>>>> finds a good one
>>>>>>>>>>>>>>> (based on the immediate context - what are the cases where
>>>>>>>>>>>>>>> BuildDeclarationNameExpr fails here? (when does NE.isInvalid 
>>>>>>>>>>>>>>> fail?)).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think there is a bit of confusion on what a "retry" is.
>>>>>>>>>>>>>> Within TransformTypoExpr, for either the first TypoExpr ever 
>>>>>>>>>>>>>> encountered by
>>>>>>>>>>>>>> this TreeTransform or for a TypoExpr without a cached 
>>>>>>>>>>>>>> substitution, it
>>>>>>>>>>>>>> tries to build a valid subexpression from the TypoCorrection 
>>>>>>>>>>>>>> candidates
>>>>>>>>>>>>>> (e.g. a DeclRefExpr) and continues trying candidates until it 
>>>>>>>>>>>>>> finds one for
>>>>>>>>>>>>>> which a valid subexpression can be formed, which it caches and 
>>>>>>>>>>>>>> returns. If
>>>>>>>>>>>>>> none of the candidates worked, it will cache and return an 
>>>>>>>>>>>>>> invalid
>>>>>>>>>>>>>> ExprResult. The TreeTransform then takes the returned 
>>>>>>>>>>>>>> subexpression and
>>>>>>>>>>>>>> tries to rebuild the full expression (possibly trying to 
>>>>>>>>>>>>>> transform multiple
>>>>>>>>>>>>>> TypoExprs along with CallExprs, etc).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't remember off the top of my head without looking at
>>>>>>>>>>>>>> the body, but I think BuildDeclarationNameExpr will fail in 
>>>>>>>>>>>>>> cases where the
>>>>>>>>>>>>>> correction is e.g. a type name, or a member reference 
>>>>>>>>>>>>>> name--basically,
>>>>>>>>>>>>>> BuildDeclarationNameExpr returns an ExprResult and the correct 
>>>>>>>>>>>>>> behavior in
>>>>>>>>>>>>>> principle is to check for an invalid result and try the next 
>>>>>>>>>>>>>> correction
>>>>>>>>>>>>>> candidate when one is seen--even if the Consumer never returns a 
>>>>>>>>>>>>>> candidate
>>>>>>>>>>>>>> for which BuildDeclarationNameExpr would fail.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So, in your example below, maybe the right answer is BEH,
>>>>>>>>>>>>>>> but AF is valid, with no correct candidate for the 3 choice.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No, AF with no correction for the 3rd TypoExpr would result
>>>>>>>>>>>>>> in the entire Expr result being invalid within the main 
>>>>>>>>>>>>>> Transform method.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Consumers start at "before the first" element, calling
>>>>>>>>>>>>>>> "getNextCorrection" on a new consumer gives you the first 
>>>>>>>>>>>>>>> correction.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That is correct. And the "before the first" element is also
>>>>>>>>>>>>>> an empty correction for use with getCurrentCorrection. ;)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On the first call to TransformExpr, the first call to
>>>>>>>>>>>>>>> TransformTypoExpr chooses A, records it, continues.
>>>>>>>>>>>>>>> Gets to the second TransformTypoExpr, tries D and E,
>>>>>>>>>>>>>>> succeeds at F.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> it would only try D then E then F within a single call to
>>>>>>>>>>>>>> TransformTypoExpr if BuildDeclarationNameExpr failed for D and E.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On the third TransformTypoExpr, it tries H and I, both fail,
>>>>>>>>>>>>>>> so it returns from the last line ("return TransformCache[E] = 
>>>>>>>>>>>>>>> ExprError]).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Same case here--that last line would only be reached if H and
>>>>>>>>>>>>>> I both failed within the call to BuildDeclarationNameExpr.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We get back to Transform with a failure. So we walk the
>>>>>>>>>>>>>>> TypoExprs (in CheckAndAdvanceTypoExprCorrectionStreams) we 
>>>>>>>>>>>>>>> reset the 3rd
>>>>>>>>>>>>>>> consumer (because it was finished).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No, CheckAndAdvanceTypoExprCorrectionStreams would do nothing
>>>>>>>>>>>>>> if the first TypoExpr is not finished. If the first TypoExpr is 
>>>>>>>>>>>>>> finished,
>>>>>>>>>>>>>> then it would reset it and clear the cached result for the 
>>>>>>>>>>>>>> second TypoExpr
>>>>>>>>>>>>>> (so that the next call to TransformTypoExpr on the second 
>>>>>>>>>>>>>> TypoExpr would
>>>>>>>>>>>>>> try a new candidate instead of using the cached result). Only if 
>>>>>>>>>>>>>> both the
>>>>>>>>>>>>>> first and second TypoExprs were *both* finished would the cached 
>>>>>>>>>>>>>> value of
>>>>>>>>>>>>>> the third TypoExpr be cleared (and the candidate streams for 
>>>>>>>>>>>>>> both the first
>>>>>>>>>>>>>> and second TypoExprs would be reset in addition to their cached 
>>>>>>>>>>>>>> results
>>>>>>>>>>>>>> being removed from the cache). The only way
>>>>>>>>>>>>>> CheckAndAdvanceTypoExprCorrectionStreams would see if the third 
>>>>>>>>>>>>>> TypoExpr
>>>>>>>>>>>>>> was finished would be if both the first and second TypoExprs 
>>>>>>>>>>>>>> were also
>>>>>>>>>>>>>> finished, and upon resetting it the tree transform would only be 
>>>>>>>>>>>>>> attempted
>>>>>>>>>>>>>> again if there were at least four known TypoExprs (i.e. the 
>>>>>>>>>>>>>> number of known
>>>>>>>>>>>>>> TypoExprs is greater than the number of TypoExprs that were 
>>>>>>>>>>>>>> finished and
>>>>>>>>>>>>>> had to be reset by CheckAndAdvanceTypoExprCorrectionStreams).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We transform again:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> First typo) We are at A so we call next and get B
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes. The first TypoExpr would always have it's next candidate
>>>>>>>>>>>>>> tried by a call to TransformTypoExpr.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Second typo) From F we continue and try G, fail, then we're
>>>>>>>>>>>>>>> finished for the second typo.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No, G would only be tried if the result from transforming F
>>>>>>>>>>>>>> was removed from the cache.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Third typo) Perhaps we find some solution for this, or not,
>>>>>>>>>>>>>>> it doesn't matter.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We get back to Transform with a failure (second typo), So we
>>>>>>>>>>>>>>> walk the TypoExprs again, resetting the second (and maybe the 
>>>>>>>>>>>>>>> third).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If there are only three TypoExprs, then resetting the third
>>>>>>>>>>>>>> TypoExpr would mean there are no valid combinations of 
>>>>>>>>>>>>>> corrections that
>>>>>>>>>>>>>> yield a valid expression (where the final result of the 
>>>>>>>>>>>>>> transform is
>>>>>>>>>>>>>> neither an invalid ExprResult nor caused any errors to be 
>>>>>>>>>>>>>> reported by the
>>>>>>>>>>>>>> SFINAETrap).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> First typo) We are at B so we call next and get to C, let's
>>>>>>>>>>>>>>> say we fail here - C isn't even remotely valid.
>>>>>>>>>>>>>>> Now it really matters whether we find/fail on the second and
>>>>>>>>>>>>>>> third typo. If we fail on the second and third... we're going 
>>>>>>>>>>>>>>> to give up.
>>>>>>>>>>>>>>> We'll find that all the corrections are finished and we'll 
>>>>>>>>>>>>>>> conclude that
>>>>>>>>>>>>>>> we've tried everything.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Am I following this right?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Not quite. I think you are forgetting about the fact that,
>>>>>>>>>>>>>> for all TypoExprs except the first, a cached result is used if 
>>>>>>>>>>>>>> it exists
>>>>>>>>>>>>>> instead of fetching a new candidate from the Consumer and trying 
>>>>>>>>>>>>>> to build a
>>>>>>>>>>>>>> valid Expr with it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ah, looking at the code again I see what you mean. The 'break'
>>>>>>>>>>>>> is the bit that I missed in the reset/advance loop.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Which patches would I have to apply to ToT to be able to
>>>>>>>>>>>>> exercise this code? It might help for me to step through some of 
>>>>>>>>>>>>> these
>>>>>>>>>>>>> scenarios in a debugger.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> To exercise this code, you'd need the full patch series (all 8)
>>>>>>>>>>>> to exercise this code because it's not until patch #8 that anything
>>>>>>>>>>>> actually generates TypoExprs in the AST. I can send you a combined 
>>>>>>>>>>>> patch
>>>>>>>>>>>> rebased on ToT off-list if you'd like (it's ~112KB uncompressed), 
>>>>>>>>>>>> and the
>>>>>>>>>>>> new test cases in test/SemaCXX/typo-correction-delayed.cpp 
>>>>>>>>>>>> explicitly test
>>>>>>>>>>>> this functionality.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> That'd be great!
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> With your total patch applied I've been able to reproduce a few
>>>>>>>>>> typo correction scenarios - thanks!
>>>>>>>>>>
>>>>>>>>>> But I can't seem to reach/exercise the loop in
>>>>>>>>>> TransformTypos::TransformTypoExpr - if I change the "if 
>>>>>>>>>> (!NE.isInvalid())"
>>>>>>>>>> condition at the end of the loop (the only way to repeat the body of 
>>>>>>>>>> the
>>>>>>>>>> loop) to "assert(!NE.isInvalid())" it doesn't fire on any of the 
>>>>>>>>>> test cases
>>>>>>>>>> (nor on my own modified versions). Do you have some tests that cover 
>>>>>>>>>> this?
>>>>>>>>>> Could you show/describe a simple situation where it occurs so I can 
>>>>>>>>>> play
>>>>>>>>>> around with that part of the logic?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm not sure it can be triggered with CorrectTypoDelayed only
>>>>>>>>> hooked up to LookupMemberExprInRecord (a static function called by
>>>>>>>>> Sema::BuildMemberReferenceExpr directly and indirectly via
>>>>>>>>> LookupMemberExpr), but in the second patch set which hooks
>>>>>>>>> CorrectTypoDelayed up to DiagnoseEmptyLookup, the assert is triggered 
>>>>>>>>> by
>>>>>>>>> the tests SemaCXX/type-definition-in-specifier.cpp
>>>>>>>>> and SemaCXX/warn-thread-safety-parsing.cpp (in addition to the loop 
>>>>>>>>> being
>>>>>>>>> the correct/principled behavior since there is no guarantee that the
>>>>>>>>> recovery callback will generate a valid ExprResult from a non-empty
>>>>>>>>> TypoCorrection). For this patch set, to trigger the assert a code 
>>>>>>>>> snippet
>>>>>>>>> would have to correct a class member lookup and have a candidate that 
>>>>>>>>> is a
>>>>>>>>> member function, constant, variable, or template in the current class 
>>>>>>>>> or
>>>>>>>>> one of its base classes (per the RecordMemberExprValidatorCCC) yet 
>>>>>>>>> which
>>>>>>>>> yields causes MemberExprTypoRecovery to return an ExprError when 
>>>>>>>>> trying to
>>>>>>>>> build the member reference. Basically, it would require having the 
>>>>>>>>> typo
>>>>>>>>> correction try a candidate that is an accessible class member that 
>>>>>>>>> cannot
>>>>>>>>> be reference--something that I'm not sure is even possible in C++.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Hrm, OK - could you change it to an assertion (& remove the outer
>>>>>>>> loop/replace it with a conditional) until that later patch where it 
>>>>>>>> becomes
>>>>>>>> necessary, so it'll be easier to review it in the context of it being
>>>>>>>> executed/used code?
>>>>>>>>
>>>>>>>
>>>>>>> I understand your suggestion, however I don't think changing the
>>>>>>> code to do the wrong thing and assert if a false assumption is violated 
>>>>>>> is
>>>>>>> the right thing to do. If the only source of an ExprResult in the loop 
>>>>>>> was
>>>>>>> the call to Sema::BuildDeclarationNameExpr, I'd be a bit more willing 
>>>>>>> to do
>>>>>>> so... but if a recovery handler callback is provided and invoked, 
>>>>>>> there's
>>>>>>> no guarantee it won't return an invalid ExprResult, and no 
>>>>>>> documentation of
>>>>>>> the assumption that the callback must return a valid ExprResult 
>>>>>>> (outside of
>>>>>>> the one assertion you are suggesting, buried within the TypoExpr 
>>>>>>> handling
>>>>>>> machinery). Additionally, changing the code to do the wrong thing by
>>>>>>> changing the "while (...)" to "if (...)" and replacing "if
>>>>>>> (!NE.isInvalid())" with a bogus assertion does nothing to simplify the
>>>>>>> actual code block... particularly since it would immediately be changed
>>>>>>> right back to the way it is now in the next patch set (a version of 
>>>>>>> which I
>>>>>>> had already mailed out two months ago).
>>>>>>>
>>>>>>
>>>>>> My concern is that this is untested (& actually untestable/dead in
>>>>>> this revision) code - if someone were to sync to this revision and try to
>>>>>> understand what the code did, why it was there, and how it should be
>>>>>> modified to do other things, they'd be confused & probably get it wrong
>>>>>> (because there are no tests to demonstrate the motivation). It's nice for
>>>>>> code to go along with the use - the usual rules of providing tests along
>>>>>> with code.
>>>>>>
>>>>>> But I'll muddle through & try to remember to figure out the tests &
>>>>>> the design when you commit/review the next patches if necessary.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> (FYI, I also just noticed I had a mistake in the full patch I sent
>>>>>>> you... specifically I seem to have forgotten to delete a return 
>>>>>>> statement
>>>>>>> when I changed the code to look up the TypoExpr in the TransformCache 
>>>>>>> only
>>>>>>> one and assign the result to the created entry and rebased the rest of 
>>>>>>> the
>>>>>>> patch set:
>>>>>>>
>>>>>>> diff --git a/lib/Sema/SemaExprCXX.cpp b/lib/Sema/SemaExprCXX.cpp
>>>>>>> index 81b3b01..9e05e82 100644
>>>>>>> --- a/lib/Sema/SemaExprCXX.cpp
>>>>>>> +++ b/lib/Sema/SemaExprCXX.cpp
>>>>>>> @@ -6049,8 +6049,6 @@ public:
>>>>>>>          if (!TC.isKeyword())
>>>>>>>            R.addDecl(TC.getCorrectionDecl());
>>>>>>>          NE = SemaRef.BuildDeclarationNameExpr(CXXScopeSpec(), R,
>>>>>>> false);
>>>>>>> -        return TransformCache[E] =
>>>>>>> -                   SemaRef.BuildDeclarationNameExpr(CXXScopeSpec(),
>>>>>>> R, false);
>>>>>>>        }
>>>>>>>        assert(!NE.isUnset() &&
>>>>>>>               "Typo was transformed into a valid-but-null
>>>>>>> ExprResult");
>>>>>>>
>>>>>>> So the "if (!NE.isUnset())" is intended to be the only early return
>>>>>>> within the loop, though it still does not fail as an assert within the
>>>>>>> patchset.)
>>>>>>>
>>>>>>>>
>>>>>>>> Without that wrinkle, I think I've followed the code enough to sign
>>>>>>>> off on it. Please commit.
>>>>>>>>
>>>>>>>> [I do find the way the algorithm visits the corrections to be a
>>>>>>>> little hard to follow, in that its split between the TypoExprs cache,
>>>>>>>> consumer state, TransformTypoExpr, and
>>>>>>>> CheckAndAdvanceTypoExprCorrectionStreams - in some very abstract 
>>>>>>>> theory I
>>>>>>>> would like having each sequence of typo candidates (the Consumers 
>>>>>>>> represent
>>>>>>>> this) expose an iterator pair, then a generic utility to do 
>>>>>>>> combinations of
>>>>>>>> a list of ranges and it would expose an iterator over those 
>>>>>>>> combinations
>>>>>>>> which you could just increment to get the next one - it'd separate out 
>>>>>>>> the
>>>>>>>> combination logic into a reusable component that could be easily
>>>>>>>> examined/tested, then composed with the typo correction stuff to keep 
>>>>>>>> the
>>>>>>>> logic there relatively simple/clearer.
>>>>>>>>
>>>>>>>> I might try to work up an example of what I mean]
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> - David
>>>>>>>>>>
>>>>>>>>>> (micro review comment I saw along the way: I /think/ 
>>>>>>>>>> CheckAndAdvanceTypoExprCorrectionStreams
>>>>>>>>>> could be simplified as follows:
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> You're right, it can be simplified now that the loop is in its own
>>>>>>>>> function. Changed.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   bool CheckAndAdvanceTypoExprCorrectionStreams() {
>>>>>>>>>>     for (auto TE : TypoExprs) {
>>>>>>>>>>       TransformCache.erase(TE);
>>>>>>>>>>       auto &State = SemaRef.getTypoExprState(TE);
>>>>>>>>>>       if (!State.Consumer->finished())
>>>>>>>>>>         return true;
>>>>>>>>>>       State.Consumer->resetCorrectionStream();
>>>>>>>>>>     }
>>>>>>>>>>     return false;
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> (I'm sort of wondering if this might be better expressed by
>>>>>>>>>>>>> some iterator scheme, but I'm not sure - there's no pretty 
>>>>>>>>>>>>> next_combination
>>>>>>>>>>>>> algorithm (someone proposed one to boost at some point, it seems, 
>>>>>>>>>>>>> but
>>>>>>>>>>>>> that's about as close as it's got to anything common/generic) 
>>>>>>>>>>>>> anyway)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So for 3 TypoExprs... let's say 1 has the candidates (A, B,
>>>>>>>>>>>>>>>> C), 2 has the candidates (D, E, F, G) and 3 has the candidates 
>>>>>>>>>>>>>>>> (H, I) the
>>>>>>>>>>>>>>>> iterations would test the combinations in order:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> A D H
>>>>>>>>>>>>>>>> B D H
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I suppose a simpler form of my above example is: if, on the
>>>>>>>>>>>>>>> first pass, we get ADH, on the next pass the code as-is would 
>>>>>>>>>>>>>>> result in
>>>>>>>>>>>>>>> BCI, wouldn't it? It'll "next" each of the corrections.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (I think you meant BEI, and C is only a candidate of the
>>>>>>>>>>>>>> first TypoExpr, not the second one.) No it won't because of the 
>>>>>>>>>>>>>> caching.
>>>>>>>>>>>>>> The second TypoExpr is only effectively "next"ed once all of the 
>>>>>>>>>>>>>> candidates
>>>>>>>>>>>>>> for the first TypoExpr have been tried with the current 
>>>>>>>>>>>>>> candidate of the
>>>>>>>>>>>>>> second TypoExpr, and the third TypoExpr would only be "next"ed 
>>>>>>>>>>>>>> if all of
>>>>>>>>>>>>>> the candidates of both the first and second TypoExpr have been 
>>>>>>>>>>>>>> tried with
>>>>>>>>>>>>>> the current candidate of the third TypoExpr.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And I'm still not quite understanding why there's the  E !=
>>>>>>>>>>>>>>> TypoExprs[0] condition... maybe that's part of what I'm 
>>>>>>>>>>>>>>> missing. It looks
>>>>>>>>>>>>>>> like that code would cause, in a single Transform, multiple 
>>>>>>>>>>>>>>> instances of
>>>>>>>>>>>>>>> the first typo to be computed with different/subsequent typo 
>>>>>>>>>>>>>>> results...
>>>>>>>>>>>>>>> which seems confusing/strange. (though I don't fully understand 
>>>>>>>>>>>>>>> how/where
>>>>>>>>>>>>>>> multiple instances of the same TypoExpr will be visited in a 
>>>>>>>>>>>>>>> single
>>>>>>>>>>>>>>> Transform)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That condition ensures that the cached result is never used
>>>>>>>>>>>>>> for the first TypoExpr (instead of having an extra conditional 
>>>>>>>>>>>>>> for all of
>>>>>>>>>>>>>> the return statements below here to not cache the result for the 
>>>>>>>>>>>>>> first
>>>>>>>>>>>>>> TypoExpr).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (some nitpicks of the code - there's a duplicate lookup of
>>>>>>>>>>>>>>> TransformCache (count then [], or count then insert further 
>>>>>>>>>>>>>>> down the
>>>>>>>>>>>>>>> algorithm - it should probably just be:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ExprResult &CacheEntry = TransformCache[E];
>>>>>>>>>>>>>>> if (CacheEntry)
>>>>>>>>>>>>>>>   return CacheEntry;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This wouldn't work because "TransformCache[E]" would add a
>>>>>>>>>>>>>> cache entry for E, and either the "if (CacheEntry)" would always 
>>>>>>>>>>>>>> be true,
>>>>>>>>>>>>>> or an ExprError() would be considered false and an ExprError 
>>>>>>>>>>>>>> cache entry
>>>>>>>>>>>>>> would always be ignored, both of which would be bad.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> you're right that ExprResult isn't boolean testable - but it
>>>>>>>>>>>>> is possible to distinguish ExprError from a default-constructed 
>>>>>>>>>>>>> ExprResult.
>>>>>>>>>>>>> A default constructed ExprResult is in the valid-but-unset state, 
>>>>>>>>>>>>> so far as
>>>>>>>>>>>>> I can tell, so it would look like:
>>>>>>>>>>>>>
>>>>>>>>>>>>> if (!CacheEntry.isUnset())
>>>>>>>>>>>>>   return CacheEntry
>>>>>>>>>>>>>
>>>>>>>>>>>>> (which is, admittedly, not the /most/ legible thing ever)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I'm worried this won't work, or at least would be (potentially)
>>>>>>>>>>>> fragile, because .isUnset() is defined as "valid-but-null" and 
>>>>>>>>>>>> Sema sets
>>>>>>>>>>>> valid-but-null as an explicit state in the return values of various
>>>>>>>>>>>> functions--at least a few of which I saw while looking at users of 
>>>>>>>>>>>> the typo
>>>>>>>>>>>> correction code.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Ooh... I could totally see that happening (& thanks for the
>>>>>>>>>>> explanation).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Of course, a new local variable could be created containing
>>>>>>>>>>>>>> the result of TransformCache.find(E) and then compared to
>>>>>>>>>>>>>> TransformCache.end() and returning as appropriate, but I find it 
>>>>>>>>>>>>>> to be more
>>>>>>>>>>>>>> verbose and less clear:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> auto CacheEntry = TransformCache.find(E);
>>>>>>>>>>>>>> if (!TypoExprs.insert(E) && CacheEntry !=
>>>>>>>>>>>>>> TransformCache.end())
>>>>>>>>>>>>>>   return CacheEntry->second;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And neither of the subsequent return statements would be
>>>>>>>>>>>>>> changed because the cache entry would still need to be created...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you were using the iterator version, they could be hinted
>>>>>>>>>>>>> insertions which might be better - but I agree, if you're just 
>>>>>>>>>>>>> using the
>>>>>>>>>>>>> iterators it's probably not terribly helpful. But as shown above, 
>>>>>>>>>>>>> you can
>>>>>>>>>>>>> actually just use the op[] at the start, check !isUnset, then 
>>>>>>>>>>>>> initialize
>>>>>>>>>>>>> the elements. This is a fairly common idiom in map related code, 
>>>>>>>>>>>>> though
>>>>>>>>>>>>> more often it looks like
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> After a bit more consideration, I've taken your suggested
>>>>>>>>>>>> approach of using .isUnset() and added an assertion on trying to 
>>>>>>>>>>>> cache an
>>>>>>>>>>>> "unset" value ("assert(!NE.isUnset() && ...);").
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Sounds like a fair tradeoff - if the assertion does fire we can
>>>>>>>>>>> add a test case for it & switch back to something like your original
>>>>>>>>>>> formulation.
>>>>>>>>>>>
>>>>>>>>>>> The other gotcha to be aware of is that since it's a reference
>>>>>>>>>>> into a DenseMap are invalidated by insertions - so if any of the 
>>>>>>>>>>> work to
>>>>>>>>>>> initialize the TransformCache (the while nextCorrection loop, and 
>>>>>>>>>>> probably
>>>>>>>>>>> most importantly the "BuildDeclarationNameExpr" call) can also 
>>>>>>>>>>> modify the
>>>>>>>>>>> TransformCache (by transforming another typo expr) then the 
>>>>>>>>>>> reference will
>>>>>>>>>>> be invalidated and the approach I'd suggested won't be applicable 
>>>>>>>>>>> (you
>>>>>>>>>>> could still keep just one lookup in the cache hit case, but in the 
>>>>>>>>>>> cache
>>>>>>>>>>> miss you'd have to do a second lookup after the work)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>     return CacheEntry = NE;
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>> return CacheEntry = ExprError();
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> C D H
>>>>>>>>>>>>>>>> A E H
>>>>>>>>>>>>>>>> B E H
>>>>>>>>>>>>>>>> C E H
>>>>>>>>>>>>>>>> A F H
>>>>>>>>>>>>>>>> B F H
>>>>>>>>>>>>>>>> C F H
>>>>>>>>>>>>>>>> A G H
>>>>>>>>>>>>>>>> B G H
>>>>>>>>>>>>>>>> C G H
>>>>>>>>>>>>>>>> A D I
>>>>>>>>>>>>>>>> B D I
>>>>>>>>>>>>>>>> C D I
>>>>>>>>>>>>>>>> A E I
>>>>>>>>>>>>>>>> B E I
>>>>>>>>>>>>>>>> C E I
>>>>>>>>>>>>>>>> A F I
>>>>>>>>>>>>>>>> B F I
>>>>>>>>>>>>>>>> C F I
>>>>>>>>>>>>>>>> A G I
>>>>>>>>>>>>>>>> B G I
>>>>>>>>>>>>>>>> C G I
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we get to the end, have invalid code, and the first typo 
>>>>>>>>>>>>>>>>> has no alternatives left to try - I guess we just want to 
>>>>>>>>>>>>>>>>> print errors for everything, without typo corrections?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If all of the combinations of typo corrections failed to
>>>>>>>>>>>>>>>> create an acceptable Expr, then the current correction for all 
>>>>>>>>>>>>>>>> of TypoExprs
>>>>>>>>>>>>>>>> would be an empty TypoCorrection which would cause the 
>>>>>>>>>>>>>>>> corresponding
>>>>>>>>>>>>>>>> diagnostic handlers to print out errors without suggestions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It might be helpful to split out some of these chunks (the 
>>>>>>>>>>>>>>>>> state resetting code and maybe the diagnostic printing at the 
>>>>>>>>>>>>>>>>> end) into separate functions to give them logical names & 
>>>>>>>>>>>>>>>>> make the higher level algorithm more clear?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I've split the diagnostics emission and the state handling
>>>>>>>>>>>>>>>> into separate helpers, and expanded their description 
>>>>>>>>>>>>>>>> comments. I'm not
>>>>>>>>>>>>>>>> keen on the name for the state handler but it is the best name 
>>>>>>>>>>>>>>>> I could come
>>>>>>>>>>>>>>>> up with that was even remotely descriptive.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Roughly works, though I'm still quite confused, it seems.
>>>>>>>>>>>>>>> (alternative name for that function would be "reset" rather 
>>>>>>>>>>>>>>> than "advance"
>>>>>>>>>>>>>>> - but once I understand the algorithm better I might have better
>>>>>>>>>>>>>>> suggestions)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The function removes the cached result for the first TypoExpr
>>>>>>>>>>>>>> then checks the TypoExpr's correction candidate stream; if it 
>>>>>>>>>>>>>> has reached
>>>>>>>>>>>>>> the end of the stream, it will reset the stream then perform the 
>>>>>>>>>>>>>> same check
>>>>>>>>>>>>>> for the second TypoExpr, continuing until it either has reset the
>>>>>>>>>>>>>> correction streams for all of the TypoExprs or has found a 
>>>>>>>>>>>>>> TypoExpr that
>>>>>>>>>>>>>> hadn't reached the end of its correction stream. And in the 
>>>>>>>>>>>>>> former case, it
>>>>>>>>>>>>>> would signal that all combinations of corrections had been 
>>>>>>>>>>>>>> retried and the
>>>>>>>>>>>>>> TreeTransform should stop searching for a valid combination of 
>>>>>>>>>>>>>> corrections.
>>>>>>>>>>>>>> And now that I think about it, the "E != TypoExprs[0]" is 
>>>>>>>>>>>>>> probably now a
>>>>>>>>>>>>>> superfluous check since calling 
>>>>>>>>>>>>>> CheckAndAdvanceTypoExprCorrectionStreams
>>>>>>>>>>>>>> would always remove the first TypoExpr from the result cache.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The loop at the end ("for (auto E : TypoExprs) {") is over a 
>>>>>>>>>>>>>> hashing data structure, right? Is that going to cause unstable 
>>>>>>>>>>>>>> output of diagnostics? (eg: different machines, different memory 
>>>>>>>>>>>>>> layout, the same program might get the typos printed in a 
>>>>>>>>>>>>>> different order) or is there something else that will fix that?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If you have to change the set to be a SetVector, or similar 
>>>>>>>>>>>>>>>>> (to stabilize output) - then you won't need the FirstTypoExpr 
>>>>>>>>>>>>>>>>> anymore - it'll just be the first one in the set, since you 
>>>>>>>>>>>>>>>>> iterate in insertion order at that point.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Switched it from a SmallPtrSet to a SmallSetVector like it
>>>>>>>>>>>>>>>> should have been. Doing so also allowed me to simplify the 
>>>>>>>>>>>>>>>> state handling
>>>>>>>>>>>>>>>> code by being able to assume a deterministic order, and 
>>>>>>>>>>>>>>>> eliminating
>>>>>>>>>>>>>>>> FirstTypoExpr.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> "while (TypoCorrection TC = 
>>>>>>>>>>>>>>>>> State.Consumer->getNextCorrection()) {" doesn't seem to be a 
>>>>>>>>>>>>>>>>> loop - it has an unconditional return at the end of the body. 
>>>>>>>>>>>>>>>>> Should it just be an 'if'?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> There was actually a missing check on whether the result
>>>>>>>>>>>>>>>> of Sema::BuildDeclarationNameExpr was valid.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Were there missing/added test cases to cover this situation?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No because at this point in the patch series, there is
>>>>>>>>>>>>>> nothing that actually generates TypoExprs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> "+ (FullExpr.get()->isTypeDependent() ||
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +       FullExpr.get()->isValueDependent() ||
>>>>>>>>>>>>>>>>> +       FullExpr.get()->isInstantiationDependent())) {"
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What are those checks for? I don't know what condition we 
>>>>>>>>>>>>>>>>> would be in if we had typo corrections but it wasn't type, 
>>>>>>>>>>>>>>>>> value, or instantiation dependent. Perhaps a comment could 
>>>>>>>>>>>>>>>>> explain that?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The check is to avoid running the tree transform on Expr
>>>>>>>>>>>>>>>> trees that are known to not have a TypoExpr in them, 
>>>>>>>>>>>>>>>> regardless of whether
>>>>>>>>>>>>>>>> the current expression evaluation context still indicates 
>>>>>>>>>>>>>>>> there are
>>>>>>>>>>>>>>>> uncorrected typos. I've added a comment to that effect.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hmm - I suppose my question might then be: When would we
>>>>>>>>>>>>>>> have NumTypos > 0 but be in a situation where the full 
>>>>>>>>>>>>>>> expression cannot
>>>>>>>>>>>>>>> have any typos in it?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't remember off the top of my head because it's been a
>>>>>>>>>>>>>> few months now since I wrote the original patches... but 
>>>>>>>>>>>>>> basically any
>>>>>>>>>>>>>> situation where more than one FullExpr shares an expression 
>>>>>>>>>>>>>> evaluation
>>>>>>>>>>>>>> context or when the current expression evaluation context had 
>>>>>>>>>>>>>> its NumTypos
>>>>>>>>>>>>>> increased by an expression evaluation context that was popped 
>>>>>>>>>>>>>> off the
>>>>>>>>>>>>>> stack. And TreeTransforms aren't cheap enough to blindly run on 
>>>>>>>>>>>>>> every Expr
>>>>>>>>>>>>>> when NumTypos is non-zero.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To simplify Sema::clearDelayedTypo you could add 
>>>>>>>>>>>>>>>> MapVector::erase(const KeyType&) (given that std::map has such 
>>>>>>>>>>>>>>>> an operation, it seems like a reasonable one for MapVector to 
>>>>>>>>>>>>>>>> have).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Done (simple patch attached).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Great - if you'd like to commit this separately, it should
>>>>>>>>>>>>>>> have a unit test. Otherwise it's probably simple enough to 
>>>>>>>>>>>>>>> commit along
>>>>>>>>>>>>>>> with your new usage if you prefer.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Committed with unit test as r219240.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sema::getTypoExprState looks like it could be a const
>>>>>>>>>>>>>>>>> function (it looks like it's never meant to be called when 
>>>>>>>>>>>>>>>>> the element
>>>>>>>>>>>>>>>>> isn't already in the map) - using 'lookup' instead of 
>>>>>>>>>>>>>>>>> operator[] will be a
>>>>>>>>>>>>>>>>> const operation on the map (& assert if the element doesn't 
>>>>>>>>>>>>>>>>> exist in the
>>>>>>>>>>>>>>>>> map).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sadly I cannot do that because the operator[] cannot be
>>>>>>>>>>>>>>>> used if the method is made const, and .lookup() can't be used 
>>>>>>>>>>>>>>>> because it
>>>>>>>>>>>>>>>> returns by-value instead of by-reference and TypoExprState is 
>>>>>>>>>>>>>>>> not copyable:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ../tools/clang/lib/Sema/SemaLookup.cpp:4600:10: warning:
>>>>>>>>>>>>>>>> returning reference to local temporary object 
>>>>>>>>>>>>>>>> [-Wreturn-stack-address]
>>>>>>>>>>>>>>>>   return DelayedTypos.lookup(TE);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ../include/llvm/ADT/MapVector.h:88:30: error: call to
>>>>>>>>>>>>>>>> implicitly-deleted copy constructor of 'const 
>>>>>>>>>>>>>>>> clang::Sema::TypoExprState'
>>>>>>>>>>>>>>>>     return Pos == Map.end()? ValueT() :
>>>>>>>>>>>>>>>> Vector[Pos->second].second;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Oh, weird - my mistake in assuming/misremembering what
>>>>>>>>>>>>>>> lookup actually does. I'd still make getTypoExprState const and 
>>>>>>>>>>>>>>> use
>>>>>>>>>>>>>>> find+assert just to constrain the interface a bit more rather 
>>>>>>>>>>>>>>> than leaving
>>>>>>>>>>>>>>> it possible to accidentally create a new entry in the map when 
>>>>>>>>>>>>>>> the API is
>>>>>>>>>>>>>>> only intended to query existing entries.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Done. The new changes I've made overall are:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/include/clang/Sema/Sema.h
>>>>>>>>>>>>>> b/include/clang/Sema/Sema.h
>>>>>>>>>>>>>> index b30bf5e..f644fdd 100644
>>>>>>>>>>>>>> --- a/include/clang/Sema/Sema.h
>>>>>>>>>>>>>> +++ b/include/clang/Sema/Sema.h
>>>>>>>>>>>>>> @@ -2628,7 +2628,7 @@ private:
>>>>>>>>>>>>>>                               bool ErrorRecovery, bool
>>>>>>>>>>>>>> &IsUnqualifiedLookup);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  public:
>>>>>>>>>>>>>> -  const TypoExprState &getTypoExprState(TypoExpr *TE);
>>>>>>>>>>>>>> +  const TypoExprState &getTypoExprState(TypoExpr *TE) const;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    /// \brief Clears the state of the given TypoExpr.
>>>>>>>>>>>>>>    void clearDelayedTypo(TypoExpr *TE);
>>>>>>>>>>>>>> diff --git a/lib/Sema/SemaExprCXX.cpp
>>>>>>>>>>>>>> b/lib/Sema/SemaExprCXX.cpp
>>>>>>>>>>>>>> index 73aa997..1f08276 100644
>>>>>>>>>>>>>> --- a/lib/Sema/SemaExprCXX.cpp
>>>>>>>>>>>>>> +++ b/lib/Sema/SemaExprCXX.cpp
>>>>>>>>>>>>>> @@ -6028,7 +6028,7 @@ public:
>>>>>>>>>>>>>>      // If the TypoExpr hasn't been seen before, record it.
>>>>>>>>>>>>>> Otherwise, return the
>>>>>>>>>>>>>>      // cached transformation result if there is one and the
>>>>>>>>>>>>>> TypoExpr isn't the
>>>>>>>>>>>>>>      // first one that was encountered.
>>>>>>>>>>>>>> -    if (!TypoExprs.insert(E) && E != TypoExprs[0] &&
>>>>>>>>>>>>>> TransformCache.count(E)) {
>>>>>>>>>>>>>> +    if (!TypoExprs.insert(E) && TransformCache.count(E)) {
>>>>>>>>>>>>>>        return TransformCache[E];
>>>>>>>>>>>>>>      }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/lib/Sema/SemaLookup.cpp b/lib/Sema/SemaLookup.cpp
>>>>>>>>>>>>>> index 8ddb5fd..bb6c76a 100644
>>>>>>>>>>>>>> --- a/lib/Sema/SemaLookup.cpp
>>>>>>>>>>>>>> +++ b/lib/Sema/SemaLookup.cpp
>>>>>>>>>>>>>> @@ -4601,8 +4601,11 @@ TypoExpr
>>>>>>>>>>>>>> *Sema::createDelayedTypo(std::unique_ptr<TypoCorrectionConsumer> 
>>>>>>>>>>>>>> TCC,
>>>>>>>>>>>>>>    return TE;
>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -const Sema::TypoExprState &Sema::getTypoExprState(TypoExpr
>>>>>>>>>>>>>> *TE) {
>>>>>>>>>>>>>> -  return DelayedTypos[TE];
>>>>>>>>>>>>>> +const Sema::TypoExprState &Sema::getTypoExprState(TypoExpr
>>>>>>>>>>>>>> *TE) const {
>>>>>>>>>>>>>> +  auto Entry = DelayedTypos.find(TE);
>>>>>>>>>>>>>> +  assert(Entry != DelayedTypos.end() &&
>>>>>>>>>>>>>> +         "Failed to get the state for a TypoExpr!");
>>>>>>>>>>>>>> +  return Entry->second;
>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  void Sema::clearDelayedTypo(TypoExpr *TE) {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Otherwise this all looks like rather neat stuff - though
>>>>>>>>>>>>>>>>> I'm not as able to review in detail some of the refactoring 
>>>>>>>>>>>>>>>>> of existing
>>>>>>>>>>>>>>>>> typo correction stuff. It looks plausible.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I've attached the updated patch. Thanks for reviewing this,
>>>>>>>>>>>>>>>> David!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Sep 25, 2014 at 2:00 PM, Kaelyn Takata <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ping
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Aug 26, 2014 at 11:05 AM, Kaelyn Takata <
>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> +dblaikie
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Jul 31, 2014 at 1:20 PM, Kaelyn Takata <
>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Part of the infrastructure is a map from a TypoExpr to
>>>>>>>>>>>>>>>>>>>> the Sema-specific
>>>>>>>>>>>>>>>>>>>> state needed to correct it, along with helpers to ease
>>>>>>>>>>>>>>>>>>>> dealing with the
>>>>>>>>>>>>>>>>>>>> state.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The the typo count is propagated up the stack of
>>>>>>>>>>>>>>>>>>>> ExpressionEvaluationContextRecords when one is popped
>>>>>>>>>>>>>>>>>>>> off of to
>>>>>>>>>>>>>>>>>>>> avoid accidentally dropping TypoExprs on the floor. For
>>>>>>>>>>>>>>>>>>>> example,
>>>>>>>>>>>>>>>>>>>> the attempted correction of g() in
>>>>>>>>>>>>>>>>>>>> test/CXX/class/class.mem/p5-0x.cpp
>>>>>>>>>>>>>>>>>>>> happens with an ExpressionEvaluationContextRecord that
>>>>>>>>>>>>>>>>>>>> is popped off
>>>>>>>>>>>>>>>>>>>> the stack prior to ActOnFinishFullExpr being called and
>>>>>>>>>>>>>>>>>>>> the tree
>>>>>>>>>>>>>>>>>>>> transform for TypoExprs being run.
>>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>>  include/clang/Sema/Sema.h           |  44 +++++
>>>>>>>>>>>>>>>>>>>>  include/clang/Sema/SemaInternal.h   |  15 +-
>>>>>>>>>>>>>>>>>>>>  include/clang/Sema/TypoCorrection.h |   2 +-
>>>>>>>>>>>>>>>>>>>>  lib/Sema/SemaExpr.cpp               |   7 +
>>>>>>>>>>>>>>>>>>>>  lib/Sema/SemaExprCXX.cpp            | 108 ++++++++++++
>>>>>>>>>>>>>>>>>>>>  lib/Sema/SemaLookup.cpp             | 316
>>>>>>>>>>>>>>>>>>>> ++++++++++++++++++++++++------------
>>>>>>>>>>>>>>>>>>>>  6 files changed, 384 insertions(+), 108 deletions(-)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [PATCH v4 5/8] Start adding the infrastructure for handling TypoExprs.

Reply via email to