> On 17 Jun 2019, at 18:06, Akim Demaille <[email protected]> wrote:
>
> Hi Hans,

Hi,

>> On 17 June 2019, at 15:12, Hans Åberg <[email protected]> wrote:
>>
>> When a byte with high bit set that is not used in the grammar, the parser
>> generated by Bison 3.4.1, does not report an error, only if the high bit is
>> not set.
>
> This is hard to believe. I suspect your problem is elsewhere.
>
>> This occurs if one sets a Flex default rule
>>   . { return yytext[0]; }
>> and the lexer finds a stray UTF-8 byte.
>
> I would say that here, you return a char (yytext[0]) with "a high bit set",
> on an architecture where char is signed, so you are actually returning a
> negative int (when the 8th bit is set). And for Bison, any negative token
> number stands for end-of-file.

Indeed, that is likely the case.

> You should actually write:
>
>   . { return (unsigned char) yytext[0]; }

As 8-bit character tokens are not useful with UTF-8, I have replaced it with:

  %token token_error "token error"
  . { return my_parser::token::token_error; }

Please let me know if there is a better way to generate a parser error.
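
For the record, the signed-char effect described above can be reproduced outside Flex. This is only a minimal sketch: the helper names are hypothetical, signed char stands in for plain char on platforms where char is signed, and a two's-complement representation is assumed.

```cpp
#include <cassert>

// Hypothetical stand-in for the action `return yytext[0];` on a platform
// where plain char is signed: the byte is sign-extended when converted to int.
constexpr int token_without_cast(signed char c) {
    return c;
}

// Hypothetical stand-in for `return (unsigned char) yytext[0];`:
// the byte is mapped into 0..255 before being converted to int.
constexpr int token_with_cast(signed char c) {
    return static_cast<unsigned char>(c);
}

// A stray UTF-8 continuation byte (0xA9), stored the way a signed char
// buffer would hold it. Assumes two's complement (guaranteed since C++20).
constexpr signed char stray_byte = static_cast<signed char>(0xA9);
```

Under these assumptions token_without_cast(stray_byte) is negative, which is the value Bison takes for end-of-file, while token_with_cast(stray_byte) is 0xA9 (169) and reaches the parser as an ordinary character token.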
