On Jan 24, 2013, at 13:34 , Dmitri Gribenko <[email protected]> wrote:
> On Thu, Jan 24, 2013 at 10:50 PM, Jordan Rose <[email protected]> wrote:
>> Author: jrose
>> Date: Thu Jan 24 14:50:46 2013
>> New Revision: 173369
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=173369&view=rev
>> Log:
>> Handle universal character names and Unicode characters outside of literals.
>>
>> This is a missing piece for C99 conformance.
>>
>> This patch handles UCNs by adding a '\\' case to LexTokenInternal and
>> LexIdentifier -- if we see a backslash, we tentatively try to read in a UCN.
>> If the UCN is not syntactically well-formed, we fall back to the old
>> treatment: a backslash followed by an identifier beginning with 'u' (or 'U').
>>
>> Because the spelling of an identifier with UCNs still has the UCN in it, we
>> need to convert that to UTF-8 in Preprocessor::LookUpIdentifierInfo.
>>
>> Of course, valid code that does *not* use UCNs will see only a very minimal
>> performance hit (checks after each identifier for non-ASCII characters,
>> checks when converting raw_identifiers to identifiers that they do not
>> contain UCNs, and checks when getting the spelling of an identifier that it
>> does not contain a UCN).
>>
>> This patch also adds basic support for actual UTF-8 in the source. This is
>> treated almost exactly the same as UCNs except that we consider stray
>> Unicode characters to be mistakes and offer a fixit to remove them.
>> +      // Instead of letting the parser complain about the unknown token,
>> +      // just warn that we don't have valid UTF-8, then drop the character.
>
> The comment says 'just warn', but we throw an error here:
>
>> +      if (!isLexingRawMode())
>> +        Diag(CurPtr, diag::err_invalid_utf8);

Yup. We're allowed to do this one because we get to map non-ASCII characters down to ASCII however we want, and we can map them to an invalid ASCII character. At least, that was my understanding of Richard's comments.
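
For anyone following the thread without the patch open, the "tentatively read a UCN, else fall back" step from the commit log can be sketched roughly as below. This is only an illustration under my own assumptions, not the code from r173369: tryReadUCN and encodeUTF8 are hypothetical names, and the sketch checks only syntactic well-formedness (it does not reject surrogates or code points that are not allowed in identifiers).

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not Clang's API): tentatively read a UCN of the form
// \uXXXX or \UXXXXXXXX at the start of `text`. On success, returns the code
// point and stores the number of characters consumed in `length`. On failure,
// returns std::nullopt and leaves `length` untouched, so the caller can fall
// back to treating the backslash as an ordinary token.
std::optional<std::uint32_t> tryReadUCN(std::string_view text,
                                        std::size_t &length) {
  if (text.size() < 2 || text[0] != '\\')
    return std::nullopt;

  std::size_t digits;
  if (text[1] == 'u')
    digits = 4;
  else if (text[1] == 'U')
    digits = 8;
  else
    return std::nullopt;

  if (text.size() < 2 + digits)
    return std::nullopt;

  std::uint32_t codePoint = 0;
  for (std::size_t i = 0; i < digits; ++i) {
    char c = text[2 + i];
    std::uint32_t value;
    if (c >= '0' && c <= '9')
      value = static_cast<std::uint32_t>(c - '0');
    else if (c >= 'a' && c <= 'f')
      value = static_cast<std::uint32_t>(c - 'a' + 10);
    else if (c >= 'A' && c <= 'F')
      value = static_cast<std::uint32_t>(c - 'A' + 10);
    else
      return std::nullopt; // Not a hex digit: not a well-formed UCN.
    codePoint = (codePoint << 4) | value;
  }

  length = 2 + digits;
  return codePoint;
}

// Hypothetical helper: encode a code point as UTF-8, the form the identifier
// spelling is converted to before lookup. Assumes codePoint <= 0x10FFFF.
std::string encodeUTF8(std::uint32_t codePoint) {
  std::string out;
  if (codePoint < 0x80) {
    out += static_cast<char>(codePoint);
  } else if (codePoint < 0x800) {
    out += static_cast<char>(0xC0 | (codePoint >> 6));
    out += static_cast<char>(0x80 | (codePoint & 0x3F));
  } else if (codePoint < 0x10000) {
    out += static_cast<char>(0xE0 | (codePoint >> 12));
    out += static_cast<char>(0x80 | ((codePoint >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (codePoint & 0x3F));
  } else {
    out += static_cast<char>(0xF0 | (codePoint >> 18));
    out += static_cast<char>(0x80 | ((codePoint >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((codePoint >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (codePoint & 0x3F));
  }
  return out;
}
```

With these helpers, a caller that sees a backslash would try tryReadUCN first and keep lexing the identifier on success. If it returns std::nullopt, the input is handled as before: a backslash followed by an identifier starting with 'u' or 'U'.
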
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
