On Tue, Oct 21, 2014 at 1:37 PM, Kaelyn Takata <[email protected]> wrote:
> > > On Fri, Oct 17, 2014 at 9:52 AM, David Blaikie <[email protected]> wrote: > >> >> >> On Thu, Oct 9, 2014 at 10:12 AM, David Blaikie <[email protected]> >> wrote: >> >>> >>> >>> On Thu, Oct 9, 2014 at 9:58 AM, Kaelyn Takata <[email protected]> wrote: >>> >>>> >>>> >>>> On Tue, Oct 7, 2014 at 5:03 PM, David Blaikie <[email protected]> >>>> wrote: >>>> >>>>> >>>>> >>>>> On Tue, Oct 7, 2014 at 2:29 PM, Kaelyn Takata <[email protected]> >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Tue, Oct 7, 2014 at 11:43 AM, David Blaikie <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Mon, Oct 6, 2014 at 3:46 PM, Kaelyn Takata <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> On Fri, Oct 3, 2014 at 11:45 AM, David Blaikie <[email protected]> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Given that these are parts of a patch series, some of which have >>>>>>>>> already been signed off on/I haven't looked at, it's a bit hard to >>>>>>>>> review >>>>>>>>> the overall design - but I believe you & Richard talked that over, so >>>>>>>>> I'm >>>>>>>>> not too worried about that (just explaining why my review feedback's >>>>>>>>> going >>>>>>>>> to be a bit narrow) >>>>>>>>> >>>>>>>>> + if (NumTypos > 0 && !ExprEvalContexts.empty()) >>>>>>>>> + ExprEvalContexts.back().NumTypos += NumTypos; >>>>>>>>> + else >>>>>>>>> + assert(NumTypos == 0 && "There are outstanding typos after >>>>>>>>> popping the " >>>>>>>>> + "last >>>>>>>>> ExpressionEvaluationContextRecord"); >>>>>>>>> >>>>>>>>> Could this be simplified? I think it's equivalent to: >>>>>>>>> >>>>>>>>> if (!ExprEvalContexts.empty()) >>>>>>>>> ExprEvalContexts.back().NumTypos += NumTypos; >>>>>>>>> else >>>>>>>>> assert(NumTypos == 0 ... ); >>>>>>>>> >>>>>>>>> or: >>>>>>>>> >>>>>>>>> assert(ExprEvalContexts.empty() || NumTypos != 0 && ...) >>>>>>>>> if (!ExprEvalContexts.empty()) >>>>>>>>> ... >>>>>>>>> >>>>>>>>> Yes, the "NumTypos > 0" was unnecessary and has been removed. >>>>>>>> (FYI, the second form you suggested doesn't quite work because the >>>>>>>> assert >>>>>>>> would fail on the common case where ExprEvalContexts is not empty and >>>>>>>> there >>>>>>>> are no typos.) >>>>>>>> >>>>>>> >>>>>>> Hmm, right, maybe: >>>>>>> >>>>>>> assert(ExprEvalContexts.empty() ^ NumTypos != 0 && ... ) >>>>>>> if (!ExprEvalContexts.empty()) >>>>>>> ExprEvalContexts.back().NumTypos += NumTypos; >>>>>>> >>>>>>> but perhaps that's just too confusing. I just find an assertion as >>>>>>> the only thing inside a block of code to be a bit off... but it's >>>>>>> certainly >>>>>>> not unprecedented in LLVM and is probably more legible - just my own >>>>>>> hangup >>>>>>> to get over :) >>>>>>> >>>>>>> >>>>>>>> There's an unnecessary semicolon at the end of the "while (true)" loop >>>>>>>> in TransformTypos::Transform >>>>>>>>> >>>>>>>>> Removed. >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> I'd roll this up a bit: >>>>>>>>> >>>>>>>>> >>>>>>>>> + auto &State = SemaRef.getTypoExprState(TE); >>>>>>>>> + if (State.DiagHandler) >>>>>>>>> + (*State.DiagHandler)(State.Consumer->getCurrentCorrection()); >>>>>>>>> >>>>>>>>> into: >>>>>>>>> >>>>>>>>> if (auto *Handler = SemaRef.getTypoExprState(TE).DiagHandler) >>>>>>>>> (*DiagHandler)(State.Consumer->getCurrentCorrection()); >>>>>>>>> >>>>>>>>> That rollup doesn't work because State would be unknown when >>>>>>>> accessing State.Consumer in the argument to the DiagHandler. >>>>>>>> >>>>>>> >>>>>>> Oh, right! >>>>>>> >>>>>>>> count in the condition, plus insert in the body of the if in >>>>>>>> TransformTypos::TransformTypoExpr would be more efficient as a single >>>>>>>> insert-and-check (by avoiding two separate lookups into the set): >>>>>>>>> if (TypoExprs.insert(E).second) // means this element previously >>>>>>>>> didn't exist in the set, and now does >>>>>>>>> >>>>>>>>> I've changed the count-then-insert to just an insert since >>>>>>>> SmallPtrSetImpl::insert returns true when the item wasn't already in >>>>>>>> the >>>>>>>> set. >>>>>>>> >>>>>>>>> This: >>>>>>>>> >>>>>>>>> TransformCache.find(E) != TransformCache.end() >>>>>>>>> >>>>>>>>> might commonly be written as: >>>>>>>>> >>>>>>>>> TransformCache.count(E) >>>>>>>>> >>>>>>>>> Optionally with "!= 0", but often count is used directly as a boolean >>>>>>>>> test, especially when it's a single set/map (not a multi set/map). >>>>>>>>> >>>>>>>>> Changed. >>>>>>>> >>>>>>>>> This comment didn't quite make sense to me: >>>>>>>>> >>>>>>>>> >>>>>>>>> + // If corrections for the first TypoExpr have been exhausted, >>>>>>>>> retry them >>>>>>>>> + // against the next combination of substitutions for all of >>>>>>>>> the other >>>>>>>>> + // TypoExprs. >>>>>>>>> >>>>>>>>> When you say "retry them" which them do you mean? The corrections for >>>>>>>>> the first TypoExpr? But what does it mean for them to be exhausted >>>>>>>>> then, if they can still be considered? >>>>>>>>> >>>>>>>>> Maybe I need a bit of help with the higher level algorithm here - I >>>>>>>>> assume the idea is: >>>>>>>>> >>>>>>>>> Call transform. >>>>>>>>> As it visits TypoExpr nodes, it tries their first typo candidate >>>>>>>>> available. >>>>>>>>> If we get to the end and have invalid code, reset the state of all >>>>>>>>> the typo corrections except the first one (on its next query, it'll >>>>>>>>> give back its second alternative). >>>>>>>>> >>>>>>>>> Here is where the confusion is at. As the transform visits each >>>>>>>> TypoExpr node it tries the first candidate available. If the first >>>>>>>> candidates of all the TypoExprs don't work, it then tries the next (and >>>>>>>> subsequent) candidates of only the first TypoExpr that was seen. When >>>>>>>> it >>>>>>>> exhausts the candidates from the first TypoExpr, it then resets the >>>>>>>> stream >>>>>>>> of candidates for the first TypoExpr and fetches the next candidate of >>>>>>>> the >>>>>>>> TypoExpr, continuing in this way until all combinations of candidates >>>>>>>> of >>>>>>>> all TypoExprs have been tried. >>>>>>>> >>>>>>> >>>>>>> So I'm still a bit confused about which parts of the retry logic are >>>>>>> in which parts of the code. >>>>>>> >>>>>>> OK, I think I've got it. Transform is called and we don't even know >>>>>>> which TypoExprs are in the Expr. So we transform. >>>>>>> >>>>>>> As transform runs it comes across TypoExprs, and attempts to >>>>>>> transform them. For each one, it tries candidates until it finds a good >>>>>>> one >>>>>>> (based on the immediate context - what are the cases where >>>>>>> BuildDeclarationNameExpr fails here? (when does NE.isInvalid fail?)). >>>>>>> >>>>>> >>>>>> I think there is a bit of confusion on what a "retry" is. Within >>>>>> TransformTypoExpr, for either the first TypoExpr ever encountered by this >>>>>> TreeTransform or for a TypoExpr without a cached substitution, it tries >>>>>> to >>>>>> build a valid subexpression from the TypoCorrection candidates (e.g. a >>>>>> DeclRefExpr) and continues trying candidates until it finds one for >>>>>> which a >>>>>> valid subexpression can be formed, which it caches and returns. If none >>>>>> of >>>>>> the candidates worked, it will cache and return an invalid ExprResult. >>>>>> The >>>>>> TreeTransform then takes the returned subexpression and tries to rebuild >>>>>> the full expression (possibly trying to transform multiple TypoExprs >>>>>> along >>>>>> with CallExprs, etc). >>>>>> >>>>>> I don't remember off the top of my head without looking at the body, >>>>>> but I think BuildDeclarationNameExpr will fail in cases where the >>>>>> correction is e.g. a type name, or a member reference name--basically, >>>>>> BuildDeclarationNameExpr returns an ExprResult and the correct behavior >>>>>> in >>>>>> principle is to check for an invalid result and try the next correction >>>>>> candidate when one is seen--even if the Consumer never returns a >>>>>> candidate >>>>>> for which BuildDeclarationNameExpr would fail. >>>>>> >>>>>>> >>>>>>> So, in your example below, maybe the right answer is BEH, but AF is >>>>>>> valid, with no correct candidate for the 3 choice. >>>>>>> >>>>>> >>>>>> No, AF with no correction for the 3rd TypoExpr would result in the >>>>>> entire Expr result being invalid within the main Transform method. >>>>>> >>>>>> >>>>>>> Consumers start at "before the first" element, calling >>>>>>> "getNextCorrection" on a new consumer gives you the first correction. >>>>>>> >>>>>> >>>>>> That is correct. And the "before the first" element is also an empty >>>>>> correction for use with getCurrentCorrection. ;) >>>>>> >>>>>>> >>>>>>> On the first call to TransformExpr, the first call to >>>>>>> TransformTypoExpr chooses A, records it, continues. >>>>>>> Gets to the second TransformTypoExpr, tries D and E, succeeds at F. >>>>>>> >>>>>> >>>>>> it would only try D then E then F within a single call to >>>>>> TransformTypoExpr if BuildDeclarationNameExpr failed for D and E. >>>>>> >>>>>>> On the third TransformTypoExpr, it tries H and I, both fail, so it >>>>>>> returns from the last line ("return TransformCache[E] = ExprError]). >>>>>>> >>>>>> >>>>>> Same case here--that last line would only be reached if H and I both >>>>>> failed within the call to BuildDeclarationNameExpr. >>>>>> >>>>>>> >>>>>>> We get back to Transform with a failure. So we walk the TypoExprs >>>>>>> (in CheckAndAdvanceTypoExprCorrectionStreams) we reset the 3rd consumer >>>>>>> (because it was finished). >>>>>>> >>>>>> >>>>>> No, CheckAndAdvanceTypoExprCorrectionStreams would do nothing if the >>>>>> first TypoExpr is not finished. If the first TypoExpr is finished, then >>>>>> it >>>>>> would reset it and clear the cached result for the second TypoExpr (so >>>>>> that >>>>>> the next call to TransformTypoExpr on the second TypoExpr would try a new >>>>>> candidate instead of using the cached result). Only if both the first and >>>>>> second TypoExprs were *both* finished would the cached value of the third >>>>>> TypoExpr be cleared (and the candidate streams for both the first and >>>>>> second TypoExprs would be reset in addition to their cached results being >>>>>> removed from the cache). The only way >>>>>> CheckAndAdvanceTypoExprCorrectionStreams would see if the third TypoExpr >>>>>> was finished would be if both the first and second TypoExprs were also >>>>>> finished, and upon resetting it the tree transform would only be >>>>>> attempted >>>>>> again if there were at least four known TypoExprs (i.e. the number of >>>>>> known >>>>>> TypoExprs is greater than the number of TypoExprs that were finished and >>>>>> had to be reset by CheckAndAdvanceTypoExprCorrectionStreams). >>>>>> >>>>>>> >>>>>>> We transform again: >>>>>>> >>>>>>> First typo) We are at A so we call next and get B >>>>>>> >>>>>> >>>>>> Yes. The first TypoExpr would always have it's next candidate tried >>>>>> by a call to TransformTypoExpr. >>>>>> >>>>>> >>>>>>> Second typo) From F we continue and try G, fail, then we're finished >>>>>>> for the second typo. >>>>>>> >>>>>> >>>>>> No, G would only be tried if the result from transforming F was >>>>>> removed from the cache. >>>>>> >>>>>> >>>>>>> Third typo) Perhaps we find some solution for this, or not, it >>>>>>> doesn't matter. >>>>>>> >>>>>>> We get back to Transform with a failure (second typo), So we walk >>>>>>> the TypoExprs again, resetting the second (and maybe the third). >>>>>>> >>>>>> >>>>>> If there are only three TypoExprs, then resetting the third TypoExpr >>>>>> would mean there are no valid combinations of corrections that yield a >>>>>> valid expression (where the final result of the transform is neither an >>>>>> invalid ExprResult nor caused any errors to be reported by the >>>>>> SFINAETrap). >>>>>> >>>>>> >>>>> >>>>>>> First typo) We are at B so we call next and get to C, let's say we >>>>>>> fail here - C isn't even remotely valid. >>>>>>> Now it really matters whether we find/fail on the second and third >>>>>>> typo. If we fail on the second and third... we're going to give up. >>>>>>> We'll >>>>>>> find that all the corrections are finished and we'll conclude that we've >>>>>>> tried everything. >>>>>>> >>>>>>> Am I following this right? >>>>>>> >>>>>> >>>>>> Not quite. I think you are forgetting about the fact that, for all >>>>>> TypoExprs except the first, a cached result is used if it exists instead >>>>>> of >>>>>> fetching a new candidate from the Consumer and trying to build a valid >>>>>> Expr >>>>>> with it. >>>>>> >>>>> >>>>> Ah, looking at the code again I see what you mean. The 'break' is the >>>>> bit that I missed in the reset/advance loop. >>>>> >>>>> Which patches would I have to apply to ToT to be able to exercise this >>>>> code? It might help for me to step through some of these scenarios in a >>>>> debugger. >>>>> >>>> >>>> To exercise this code, you'd need the full patch series (all 8) to >>>> exercise this code because it's not until patch #8 that anything actually >>>> generates TypoExprs in the AST. I can send you a combined patch rebased on >>>> ToT off-list if you'd like (it's ~112KB uncompressed), and the new test >>>> cases in test/SemaCXX/typo-correction-delayed.cpp explicitly test this >>>> functionality. >>>> >>> >>> That'd be great! >>> >> >> With your total patch applied I've been able to reproduce a few typo >> correction scenarios - thanks! >> >> But I can't seem to reach/exercise the loop in >> TransformTypos::TransformTypoExpr - if I change the "if (!NE.isInvalid())" >> condition at the end of the loop (the only way to repeat the body of the >> loop) to "assert(!NE.isInvalid())" it doesn't fire on any of the test cases >> (nor on my own modified versions). Do you have some tests that cover this? >> Could you show/describe a simple situation where it occurs so I can play >> around with that part of the logic? >> > > I'm not sure it can be triggered with CorrectTypoDelayed only hooked up to > LookupMemberExprInRecord (a static function called by > Sema::BuildMemberReferenceExpr directly and indirectly via > LookupMemberExpr), but in the second patch set which hooks > CorrectTypoDelayed up to DiagnoseEmptyLookup, the assert is triggered by > the tests SemaCXX/type-definition-in-specifier.cpp > and SemaCXX/warn-thread-safety-parsing.cpp (in addition to the loop being > the correct/principled behavior since there is no guarantee that the > recovery callback will generate a valid ExprResult from a non-empty > TypoCorrection). For this patch set, to trigger the assert a code snippet > would have to correct a class member lookup and have a candidate that is a > member function, constant, variable, or template in the current class or > one of its base classes (per the RecordMemberExprValidatorCCC) yet which > yields causes MemberExprTypoRecovery to return an ExprError when trying to > build the member reference. Basically, it would require having the typo > correction try a candidate that is an accessible class member that cannot > be reference--something that I'm not sure is even possible in C++. > Hrm, OK - could you change it to an assertion (& remove the outer loop/replace it with a conditional) until that later patch where it becomes necessary, so it'll be easier to review it in the context of it being executed/used code? Without that wrinkle, I think I've followed the code enough to sign off on it. Please commit. [I do find the way the algorithm visits the corrections to be a little hard to follow, in that its split between the TypoExprs cache, consumer state, TransformTypoExpr, and CheckAndAdvanceTypoExprCorrectionStreams - in some very abstract theory I would like having each sequence of typo candidates (the Consumers represent this) expose an iterator pair, then a generic utility to do combinations of a list of ranges and it would expose an iterator over those combinations which you could just increment to get the next one - it'd separate out the combination logic into a reusable component that could be easily examined/tested, then composed with the typo correction stuff to keep the logic there relatively simple/clearer. I might try to work up an example of what I mean] > >> - David >> >> (micro review comment I saw along the way: I /think/ >> CheckAndAdvanceTypoExprCorrectionStreams >> could be simplified as follows: >> > > You're right, it can be simplified now that the loop is in its own > function. Changed. > >> >> bool CheckAndAdvanceTypoExprCorrectionStreams() { >> for (auto TE : TypoExprs) { >> TransformCache.erase(TE); >> auto &State = SemaRef.getTypoExprState(TE); >> if (!State.Consumer->finished()) >> return true; >> State.Consumer->resetCorrectionStream(); >> } >> return false; >> } >> >> >> >>> >>> >>>> >>>>> (I'm sort of wondering if this might be better expressed by some >>>>> iterator scheme, but I'm not sure - there's no pretty next_combination >>>>> algorithm (someone proposed one to boost at some point, it seems, but >>>>> that's about as close as it's got to anything common/generic) anyway) >>>>> >>>>> >>>>>> >>>>>>> >>>>>>>> So for 3 TypoExprs... let's say 1 has the candidates (A, B, C), 2 >>>>>>>> has the candidates (D, E, F, G) and 3 has the candidates (H, I) the >>>>>>>> iterations would test the combinations in order: >>>>>>>> >>>>>>>> A D H >>>>>>>> B D H >>>>>>>> >>>>>>> >>>>>>> I suppose a simpler form of my above example is: if, on the first >>>>>>> pass, we get ADH, on the next pass the code as-is would result in BCI, >>>>>>> wouldn't it? It'll "next" each of the corrections. >>>>>>> >>>>>> >>>>>> (I think you meant BEI, and C is only a candidate of the first >>>>>> TypoExpr, not the second one.) No it won't because of the caching. The >>>>>> second TypoExpr is only effectively "next"ed once all of the candidates >>>>>> for >>>>>> the first TypoExpr have been tried with the current candidate of the >>>>>> second >>>>>> TypoExpr, and the third TypoExpr would only be "next"ed if all of the >>>>>> candidates of both the first and second TypoExpr have been tried with the >>>>>> current candidate of the third TypoExpr. >>>>>> >>>>>>> >>>>>>> And I'm still not quite understanding why there's the E != >>>>>>> TypoExprs[0] condition... maybe that's part of what I'm missing. It >>>>>>> looks >>>>>>> like that code would cause, in a single Transform, multiple instances of >>>>>>> the first typo to be computed with different/subsequent typo results... >>>>>>> which seems confusing/strange. (though I don't fully understand >>>>>>> how/where >>>>>>> multiple instances of the same TypoExpr will be visited in a single >>>>>>> Transform) >>>>>>> >>>>>> >>>>>> That condition ensures that the cached result is never used for the >>>>>> first TypoExpr (instead of having an extra conditional for all of the >>>>>> return statements below here to not cache the result for the first >>>>>> TypoExpr). >>>>>> >>>>>>> >>>>>>> (some nitpicks of the code - there's a duplicate lookup of >>>>>>> TransformCache (count then [], or count then insert further down the >>>>>>> algorithm - it should probably just be: >>>>>>> >>>>>>> ExprResult &CacheEntry = TransformCache[E]; >>>>>>> if (CacheEntry) >>>>>>> return CacheEntry; >>>>>>> >>>>>> >>>>>> This wouldn't work because "TransformCache[E]" would add a cache >>>>>> entry for E, and either the "if (CacheEntry)" would always be true, or an >>>>>> ExprError() would be considered false and an ExprError cache entry would >>>>>> always be ignored, both of which would be bad. >>>>>> >>>>> >>>>> you're right that ExprResult isn't boolean testable - but it is >>>>> possible to distinguish ExprError from a default-constructed ExprResult. A >>>>> default constructed ExprResult is in the valid-but-unset state, so far as >>>>> I >>>>> can tell, so it would look like: >>>>> >>>>> if (!CacheEntry.isUnset()) >>>>> return CacheEntry >>>>> >>>>> (which is, admittedly, not the /most/ legible thing ever) >>>>> >>>> >>>> I'm worried this won't work, or at least would be (potentially) >>>> fragile, because .isUnset() is defined as "valid-but-null" and Sema sets >>>> valid-but-null as an explicit state in the return values of various >>>> functions--at least a few of which I saw while looking at users of the typo >>>> correction code. >>>> >>> >>> Ooh... I could totally see that happening (& thanks for the explanation). >>> >>> >>>> >>>>> >>>>>> Of course, a new local variable could be created containing the >>>>>> result of TransformCache.find(E) and then compared to >>>>>> TransformCache.end() >>>>>> and returning as appropriate, but I find it to be more verbose and less >>>>>> clear: >>>>>> >>>>>> auto CacheEntry = TransformCache.find(E); >>>>>> if (!TypoExprs.insert(E) && CacheEntry != TransformCache.end()) >>>>>> return CacheEntry->second; >>>>>> >>>>>> And neither of the subsequent return statements would be changed >>>>>> because the cache entry would still need to be created... >>>>>> >>>>> >>>>> If you were using the iterator version, they could be hinted >>>>> insertions which might be better - but I agree, if you're just using the >>>>> iterators it's probably not terribly helpful. But as shown above, you can >>>>> actually just use the op[] at the start, check !isUnset, then initialize >>>>> the elements. This is a fairly common idiom in map related code, though >>>>> more often it looks like >>>>> >>>> >>>> After a bit more consideration, I've taken your suggested approach of >>>> using .isUnset() and added an assertion on trying to cache an "unset" value >>>> ("assert(!NE.isUnset() && ...);"). >>>> >>> >>> Sounds like a fair tradeoff - if the assertion does fire we can add a >>> test case for it & switch back to something like your original formulation. >>> >>> The other gotcha to be aware of is that since it's a reference into a >>> DenseMap are invalidated by insertions - so if any of the work to >>> initialize the TransformCache (the while nextCorrection loop, and probably >>> most importantly the "BuildDeclarationNameExpr" call) can also modify the >>> TransformCache (by transforming another typo expr) then the reference will >>> be invalidated and the approach I'd suggested won't be applicable (you >>> could still keep just one lookup in the cache hit case, but in the cache >>> miss you'd have to do a second lookup after the work) >>> >>> >>>> >>>>> >>>>>> >>>>>> ... >>>>>>> return CacheEntry = NE; >>>>>>> } >>>>>>> return CacheEntry = ExprError(); >>>>>>> >>>>>>> ) >>>>>>> >>>>>>> >>>>>>> >>>>>>>> C D H >>>>>>>> A E H >>>>>>>> B E H >>>>>>>> C E H >>>>>>>> A F H >>>>>>>> B F H >>>>>>>> C F H >>>>>>>> A G H >>>>>>>> B G H >>>>>>>> C G H >>>>>>>> A D I >>>>>>>> B D I >>>>>>>> C D I >>>>>>>> A E I >>>>>>>> B E I >>>>>>>> C E I >>>>>>>> A F I >>>>>>>> B F I >>>>>>>> C F I >>>>>>>> A G I >>>>>>>> B G I >>>>>>>> C G I >>>>>>>> >>>>>>>>> If we get to the end, have invalid code, and the first typo has no >>>>>>>>> alternatives left to try - I guess we just want to print errors for >>>>>>>>> everything, without typo corrections? >>>>>>>>> >>>>>>>>> >>>>>>>> If all of the combinations of typo corrections failed to create an >>>>>>>> acceptable Expr, then the current correction for all of TypoExprs >>>>>>>> would be >>>>>>>> an empty TypoCorrection which would cause the corresponding diagnostic >>>>>>>> handlers to print out errors without suggestions. >>>>>>>> >>>>>>>>> It might be helpful to split out some of these chunks (the state >>>>>>>>> resetting code and maybe the diagnostic printing at the end) into >>>>>>>>> separate functions to give them logical names & make the higher level >>>>>>>>> algorithm more clear? >>>>>>>>> >>>>>>>>> I've split the diagnostics emission and the state handling into >>>>>>>> separate helpers, and expanded their description comments. I'm not >>>>>>>> keen on >>>>>>>> the name for the state handler but it is the best name I could come up >>>>>>>> with >>>>>>>> that was even remotely descriptive. >>>>>>>> >>>>>>> >>>>>>> Roughly works, though I'm still quite confused, it seems. >>>>>>> (alternative name for that function would be "reset" rather than >>>>>>> "advance" >>>>>>> - but once I understand the algorithm better I might have better >>>>>>> suggestions) >>>>>>> >>>>>> >>>>>> The function removes the cached result for the first TypoExpr then >>>>>> checks the TypoExpr's correction candidate stream; if it has reached the >>>>>> end of the stream, it will reset the stream then perform the same check >>>>>> for >>>>>> the second TypoExpr, continuing until it either has reset the correction >>>>>> streams for all of the TypoExprs or has found a TypoExpr that hadn't >>>>>> reached the end of its correction stream. And in the former case, it >>>>>> would >>>>>> signal that all combinations of corrections had been retried and the >>>>>> TreeTransform should stop searching for a valid combination of >>>>>> corrections. >>>>>> And now that I think about it, the "E != TypoExprs[0]" is probably now a >>>>>> superfluous check since calling CheckAndAdvanceTypoExprCorrectionStreams >>>>>> would always remove the first TypoExpr from the result cache. >>>>>> >>>>>> The loop at the end ("for (auto E : TypoExprs) {") is over a hashing >>>>>> data structure, right? Is that going to cause unstable output of >>>>>> diagnostics? (eg: different machines, different memory layout, the same >>>>>> program might get the typos printed in a different order) or is there >>>>>> something else that will fix that? >>>>>>>>> >>>>>>>>> >>>>>>>>> If you have to change the set to be a SetVector, or similar (to >>>>>>>>> stabilize output) - then you won't need the FirstTypoExpr anymore - >>>>>>>>> it'll just be the first one in the set, since you iterate in >>>>>>>>> insertion order at that point. >>>>>>>>> >>>>>>>>> Switched it from a SmallPtrSet to a SmallSetVector like it should >>>>>>>> have been. Doing so also allowed me to simplify the state handling >>>>>>>> code by >>>>>>>> being able to assume a deterministic order, and eliminating >>>>>>>> FirstTypoExpr. >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> "while (TypoCorrection TC = State.Consumer->getNextCorrection()) {" >>>>>>>>> doesn't seem to be a loop - it has an unconditional return at the end >>>>>>>>> of the body. Should it just be an 'if'? >>>>>>>>> >>>>>>>>> There was actually a missing check on whether the result of >>>>>>>> Sema::BuildDeclarationNameExpr was valid. >>>>>>>> >>>>>>> >>>>>>> Were there missing/added test cases to cover this situation? >>>>>>> >>>>>> >>>>>> No because at this point in the patch series, there is nothing that >>>>>> actually generates TypoExprs. >>>>>> >>>>>>> "+ (FullExpr.get()->isTypeDependent() || >>>>>>>>> >>>>>>>>> + FullExpr.get()->isValueDependent() || >>>>>>>>> + FullExpr.get()->isInstantiationDependent())) {" >>>>>>>>> >>>>>>>>> What are those checks for? I don't know what condition we would be in >>>>>>>>> if we had typo corrections but it wasn't type, value, or >>>>>>>>> instantiation dependent. Perhaps a comment could explain that? >>>>>>>>> >>>>>>>>> The check is to avoid running the tree transform on Expr trees >>>>>>>> that are known to not have a TypoExpr in them, regardless of whether >>>>>>>> the >>>>>>>> current expression evaluation context still indicates there are >>>>>>>> uncorrected >>>>>>>> typos. I've added a comment to that effect. >>>>>>>> >>>>>>> >>>>>>> Hmm - I suppose my question might then be: When would we have >>>>>>> NumTypos > 0 but be in a situation where the full expression cannot have >>>>>>> any typos in it? >>>>>>> >>>>>> >>>>>> I don't remember off the top of my head because it's been a few >>>>>> months now since I wrote the original patches... but basically any >>>>>> situation where more than one FullExpr shares an expression evaluation >>>>>> context or when the current expression evaluation context had its >>>>>> NumTypos >>>>>> increased by an expression evaluation context that was popped off the >>>>>> stack. And TreeTransforms aren't cheap enough to blindly run on every >>>>>> Expr >>>>>> when NumTypos is non-zero. >>>>>> >>>>>>> >>>>>>> >>>>>>>> To simplify Sema::clearDelayedTypo you could add >>>>>>>> MapVector::erase(const KeyType&) (given that std::map has such an >>>>>>>> operation, it seems like a reasonable one for MapVector to have). >>>>>>>>> >>>>>>>>> Done (simple patch attached). >>>>>>>> >>>>>>> >>>>>>> Great - if you'd like to commit this separately, it should have a >>>>>>> unit test. Otherwise it's probably simple enough to commit along with >>>>>>> your >>>>>>> new usage if you prefer. >>>>>>> >>>>>> >>>>>> Committed with unit test as r219240. >>>>>> >>>>>>> >>>>>>> Sema::getTypoExprState looks like it could be a const function (it >>>>>>>>> looks like it's never meant to be called when the element isn't >>>>>>>>> already in >>>>>>>>> the map) - using 'lookup' instead of operator[] will be a const >>>>>>>>> operation >>>>>>>>> on the map (& assert if the element doesn't exist in the map). >>>>>>>>> >>>>>>>> >>>>>>>> Sadly I cannot do that because the operator[] cannot be used if the >>>>>>>> method is made const, and .lookup() can't be used because it returns >>>>>>>> by-value instead of by-reference and TypoExprState is not copyable: >>>>>>>> >>>>>>>> ../tools/clang/lib/Sema/SemaLookup.cpp:4600:10: warning: returning >>>>>>>> reference to local temporary object [-Wreturn-stack-address] >>>>>>>> return DelayedTypos.lookup(TE); >>>>>>>> >>>>>>>> ../include/llvm/ADT/MapVector.h:88:30: error: call to >>>>>>>> implicitly-deleted copy constructor of 'const >>>>>>>> clang::Sema::TypoExprState' >>>>>>>> return Pos == Map.end()? ValueT() : Vector[Pos->second].second; >>>>>>>> >>>>>>> >>>>>>> Oh, weird - my mistake in assuming/misremembering what lookup >>>>>>> actually does. I'd still make getTypoExprState const and use find+assert >>>>>>> just to constrain the interface a bit more rather than leaving it >>>>>>> possible >>>>>>> to accidentally create a new entry in the map when the API is only >>>>>>> intended >>>>>>> to query existing entries. >>>>>>> >>>>>> >>>>>> Done. The new changes I've made overall are: >>>>>> >>>>>> diff --git a/include/clang/Sema/Sema.h b/include/clang/Sema/Sema.h >>>>>> index b30bf5e..f644fdd 100644 >>>>>> --- a/include/clang/Sema/Sema.h >>>>>> +++ b/include/clang/Sema/Sema.h >>>>>> @@ -2628,7 +2628,7 @@ private: >>>>>> bool ErrorRecovery, bool >>>>>> &IsUnqualifiedLookup); >>>>>> >>>>>> public: >>>>>> - const TypoExprState &getTypoExprState(TypoExpr *TE); >>>>>> + const TypoExprState &getTypoExprState(TypoExpr *TE) const; >>>>>> >>>>>> /// \brief Clears the state of the given TypoExpr. >>>>>> void clearDelayedTypo(TypoExpr *TE); >>>>>> diff --git a/lib/Sema/SemaExprCXX.cpp b/lib/Sema/SemaExprCXX.cpp >>>>>> index 73aa997..1f08276 100644 >>>>>> --- a/lib/Sema/SemaExprCXX.cpp >>>>>> +++ b/lib/Sema/SemaExprCXX.cpp >>>>>> @@ -6028,7 +6028,7 @@ public: >>>>>> // If the TypoExpr hasn't been seen before, record it. >>>>>> Otherwise, return the >>>>>> // cached transformation result if there is one and the TypoExpr >>>>>> isn't the >>>>>> // first one that was encountered. >>>>>> - if (!TypoExprs.insert(E) && E != TypoExprs[0] && >>>>>> TransformCache.count(E)) { >>>>>> + if (!TypoExprs.insert(E) && TransformCache.count(E)) { >>>>>> return TransformCache[E]; >>>>>> } >>>>>> >>>>>> diff --git a/lib/Sema/SemaLookup.cpp b/lib/Sema/SemaLookup.cpp >>>>>> index 8ddb5fd..bb6c76a 100644 >>>>>> --- a/lib/Sema/SemaLookup.cpp >>>>>> +++ b/lib/Sema/SemaLookup.cpp >>>>>> @@ -4601,8 +4601,11 @@ TypoExpr >>>>>> *Sema::createDelayedTypo(std::unique_ptr<TypoCorrectionConsumer> TCC, >>>>>> return TE; >>>>>> } >>>>>> >>>>>> -const Sema::TypoExprState &Sema::getTypoExprState(TypoExpr *TE) { >>>>>> - return DelayedTypos[TE]; >>>>>> +const Sema::TypoExprState &Sema::getTypoExprState(TypoExpr *TE) >>>>>> const { >>>>>> + auto Entry = DelayedTypos.find(TE); >>>>>> + assert(Entry != DelayedTypos.end() && >>>>>> + "Failed to get the state for a TypoExpr!"); >>>>>> + return Entry->second; >>>>>> } >>>>>> >>>>>> void Sema::clearDelayedTypo(TypoExpr *TE) { >>>>>> >>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> Otherwise this all looks like rather neat stuff - though I'm not >>>>>>>>> as able to review in detail some of the refactoring of existing typo >>>>>>>>> correction stuff. It looks plausible. >>>>>>>>> >>>>>>>> >>>>>>>> I've attached the updated patch. Thanks for reviewing this, David! >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>> On Thu, Sep 25, 2014 at 2:00 PM, Kaelyn Takata <[email protected]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> ping >>>>>>>>>> >>>>>>>>>> On Tue, Aug 26, 2014 at 11:05 AM, Kaelyn Takata <[email protected] >>>>>>>>>> > wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> +dblaikie >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Jul 31, 2014 at 1:20 PM, Kaelyn Takata <[email protected] >>>>>>>>>>> > wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Part of the infrastructure is a map from a TypoExpr to the >>>>>>>>>>>> Sema-specific >>>>>>>>>>>> state needed to correct it, along with helpers to ease dealing >>>>>>>>>>>> with the >>>>>>>>>>>> state. >>>>>>>>>>>> >>>>>>>>>>>> The the typo count is propagated up the stack of >>>>>>>>>>>> ExpressionEvaluationContextRecords when one is popped off of to >>>>>>>>>>>> avoid accidentally dropping TypoExprs on the floor. For example, >>>>>>>>>>>> the attempted correction of g() in >>>>>>>>>>>> test/CXX/class/class.mem/p5-0x.cpp >>>>>>>>>>>> happens with an ExpressionEvaluationContextRecord that is >>>>>>>>>>>> popped off >>>>>>>>>>>> the stack prior to ActOnFinishFullExpr being called and the tree >>>>>>>>>>>> transform for TypoExprs being run. >>>>>>>>>>>> --- >>>>>>>>>>>> include/clang/Sema/Sema.h | 44 +++++ >>>>>>>>>>>> include/clang/Sema/SemaInternal.h | 15 +- >>>>>>>>>>>> include/clang/Sema/TypoCorrection.h | 2 +- >>>>>>>>>>>> lib/Sema/SemaExpr.cpp | 7 + >>>>>>>>>>>> lib/Sema/SemaExprCXX.cpp | 108 ++++++++++++ >>>>>>>>>>>> lib/Sema/SemaLookup.cpp | 316 >>>>>>>>>>>> ++++++++++++++++++++++++------------ >>>>>>>>>>>> 6 files changed, 384 insertions(+), 108 deletions(-) >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
