Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)

2012-01-02 Thread dak
On 2012/01/01 22:06:52, Keith wrote: On 2012/01/01 10:12:27, dak wrote: Sorry, I wasn't making much sense. As a reader I want to *recognize* what the switch/case is doing rather than trying to figure it out. Maybe: // Test if these bytes are a UTF-8 encoding of a Unicode character,

Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)

2012-01-01 Thread dak
Reviewers: lemzwerg, Keith, carl.d.sorensen_gmail.com Message: On 2012/01/01 02:01:11, Keith wrote: Works nicely. Showing the input location will probably be very helpful. We probably want to remove the similar message from lily/misc.cc, because both messages together are very noisy.

Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)

2012-01-01 Thread dak
http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll File lily/lexer.ll (right): http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll#newcode134 lily/lexer.ll:134: A[a-zA-Z\200-\377] On 2012/01/01 02:01:11, Keith wrote: non-ASCII characters are used internally as-read,
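
A note on the character class quoted above: in flex, \200 and \377 are octal escapes, so [a-zA-Z\200-\377] accepts ASCII letters plus every byte from 0x80 to 0xFF, whether or not those bytes form valid UTF-8. The following is a minimal C++ sketch of that membership test; the function name in_letter_class is hypothetical and does not appear in lily/lexer.ll.

```cpp
// Hypothetical helper, not taken from lily/lexer.ll.
// Mirrors the flex character class [a-zA-Z\200-\377]:
// ASCII letters, plus any byte with the high bit set (octal 0200..0377).
bool in_letter_class (unsigned char c)
{
  return (c >= 'a' && c <= 'z')
         || (c >= 'A' && c <= 'Z')
         || c >= 0200;  // 0200 octal == 0x80
}
```

Because the class matches high bytes one at a time, it cannot by itself tell a well-formed multi-byte sequence from stray Latin-1 bytes, which is presumably why a separate check is needed to issue the warning.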

Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)

2012-01-01 Thread Keith OHara
On Sun, 01 Jan 2012 01:40:36 -0800, d...@gnu.org wrote: Our lexer has been written with the decision of using non-compressed tables and without backing up. Off topic, but maybe interesting to you: I don't think that decision was ever implemented. I don't see any %option full or similar that

Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)

2012-01-01 Thread dak
On 2012/01/01 10:05:42, Keith wrote: On Sun, 01 Jan 2012 01:40:36 -0800, mailto:d...@gnu.org wrote: Our lexer has been written with the decision of using non-compressed tables and without backing up. Off topic, but maybe interesting to you: I don't think that decision was ever

Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)

2012-01-01 Thread Keith OHara
On Sun, 01 Jan 2012 01:56:04 -0800, d...@gnu.org wrote: http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll#newcode134 lily/lexer.ll:134: A[a-zA-Z\200-\377] On 2012/01/01 02:01:11, Keith wrote: non-ASCII characters are used internally as-read, tested below only to warn the

Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)

2012-01-01 Thread dak
On 2012/01/01 10:16:47, Keith wrote: On Sun, 01 Jan 2012 01:56:04 -0800, mailto:d...@gnu.org wrote: http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll#newcode1083 lily/lexer.ll:1083: LexerWarning (_ ("non-UTF-8 characters").c_str ()); On 2012/01/01 02:01:11, Keith wrote: I

Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)

2012-01-01 Thread k-ohara5a5a
On 2012/01/01 10:12:27, dak wrote: Consider a comment in your case/switch statement that points to some reference on the various types of UTF-8 validators. I don't understand. Sorry, I wasn't making much sense. As a reader I want to *recognize* what the switch/case is doing rather

lexer.ll: Warn about non-UTF-8 characters (issue 5505090)

2011-12-31 Thread lemzwerg
LGTM. http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll File lily/lexer.ll (right): http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll#newcode1009 lily/lexer.ll:1009: case 0xc2: Wouldn't it be more effective to create an array of 128 bytes (for 0x80-0xFF) which maps `p[i]' to
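
The table suggestion above is cut off in this listing, so what the 128-entry array would map each byte to is not known. One possible reading, sketched here purely as an assumption, is a table indexed by (byte - 0x80) that gives the expected sequence length when that byte appears as a lead byte, with 0 marking bytes that can never start a sequence. The names make_lead_table, lead_table and looks_like_utf8 are hypothetical and are not from the patch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical table, not from lily/lexer.ll: for each byte 0x80..0xFF,
// the total sequence length if it is a valid UTF-8 lead byte, else 0.
static std::array<std::uint8_t, 128> make_lead_table ()
{
  std::array<std::uint8_t, 128> t{};  // zero-initialized: all invalid
  for (int b = 0xC2; b <= 0xDF; ++b) t[b - 0x80] = 2;
  for (int b = 0xE0; b <= 0xEF; ++b) t[b - 0x80] = 3;
  for (int b = 0xF0; b <= 0xF4; ++b) t[b - 0x80] = 4;
  return t;
}

static const std::array<std::uint8_t, 128> lead_table = make_lead_table ();

// Returns true if every byte of s is ASCII or part of a well-formed
// UTF-8 sequence (no overlong forms, no surrogates, nothing above U+10FFFF).
bool looks_like_utf8 (const std::string &s)
{
  std::size_t i = 0;
  while (i < s.size ())
    {
      unsigned char c = static_cast<unsigned char> (s[i]);
      if (c < 0x80)
        {
          ++i;
          continue;
        }
      std::size_t len = lead_table[c - 0x80];
      if (len == 0 || i + len > s.size ())
        return false;  // stray continuation byte, bad lead, or truncated

      // The allowed range of the second byte depends on the lead byte.
      unsigned char lo = 0x80, hi = 0xBF;
      if (c == 0xE0) lo = 0xA0;       // rejects overlong 3-byte forms
      else if (c == 0xED) hi = 0x9F;  // rejects UTF-16 surrogates
      else if (c == 0xF0) lo = 0x90;  // rejects overlong 4-byte forms
      else if (c == 0xF4) hi = 0x8F;  // rejects code points > U+10FFFF

      unsigned char c1 = static_cast<unsigned char> (s[i + 1]);
      if (c1 < lo || c1 > hi)
        return false;
      for (std::size_t k = 2; k < len; ++k)
        {
          unsigned char ck = static_cast<unsigned char> (s[i + k]);
          if (ck < 0x80 || ck > 0xBF)
            return false;
        }
      i += len;
    }
  return true;
}
```

A table replaces the long run of case labels with a single lookup, but the four special lead bytes still need their own second-byte ranges, so the saving over a switch is modest.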

Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)

2011-12-31 Thread k-ohara5a5a
Works nicely. Showing the input location will probably be very helpful. We probably want to remove the similar message from lily/misc.cc, because both messages together are very noisy. I wish I could think of a way to check the input with a canned regular expression like

Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)

2011-12-31 Thread Carl . D . Sorensen
LGTM Carl http://codereview.appspot.com/5505090/