Re: Syntax error if paragraph contains more than 1 printable character

Steve Litt Wed, 13 Dec 2023 16:02:41 -0800

James K. Lowden said on Tue, 12 Dec 2023 20:24:35 -0500

>On Tue, 12 Dec 2023 23:06:14 -0500
>Steve Litt <[email protected]> wrote:
>
>> I've already split paratext into multiple LINE tokens which represent
>> a line without its NL, and now I'm thinking of splitting line into
>> multiple chars ("[^\n]"). Perhaps this will make the rules less
>> complicated, though longer.  
>
>Have the scanner return two tokens only: 
>
>       LINE  a line of text, no newline
>       SEP   a blank line
>
>The lexer might have:
>
>.+/\n  { ... return LINE; }
>(\n[[:blank:]]*){2,} { return SEP; } // two or more newlines, i.e. one or more blank lines
>\n       { /* ignore */ }


Thanks James, this looks great!

I won't need to consider end-of-line spaces because I now have a sed
one-liner preprocessor that gets rid of trailing space :-).
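
The sed one-liner itself is not shown in this message, so the following is only a rough Python equivalent of that kind of preprocessing step, under the assumption that "trailing space" means spaces and tabs at the end of each line. The function name strip_trailing_blanks is made up for illustration.

```python
import re


def strip_trailing_blanks(text):
    """Remove spaces and tabs at the end of every line (hypothetical helper)."""
    # With re.MULTILINE, "$" matches just before each newline and at the
    # very end of the text, so every line is cleaned in one pass.
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
```

After a pass like this, a line that held only spaces or tabs becomes a truly empty line, which leaves less for the scanner rules to handle.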

Right now I've gone back to the Hello World stage and am making a
Flex/Bison scanner that does nothing but copy the file. Once I learn
from that, I'll try your suggestions. They look refreshingly simple and
understandable to me.
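
Before wiring the two-token idea into Flex, it can be prototyped outside the scanner generator. The sketch below is a rough Python stand-in for the three quoted rules, not Flex itself: the regular expressions are hand translations, the function name tokenize is made up for illustration, and unmatched text is simply skipped where Flex's default rule would echo it.

```python
import re

# Hypothetical Python translations of the quoted Flex patterns.
SEP_RE = re.compile(r"(?:\n[ \t]*){2,}")  # (\n[[:blank:]]*){2,}
LINE_RE = re.compile(r"[^\n]+(?=\n)")     # .+/\n  (the newline is not consumed)


def tokenize(text):
    """Return a list of ("LINE", text) and ("SEP", None) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = SEP_RE.match(text, pos)
        if m:
            tokens.append(("SEP", None))
            pos = m.end()
            continue
        m = LINE_RE.match(text, pos)
        if m:
            tokens.append(("LINE", m.group()))
            pos = m.end()
            continue
        # A lone newline is ignored, like the third rule. Anything else
        # (for example a final line with no newline) is skipped here;
        # Flex's default rule would echo it instead.
        pos += 1
    return tokens
```

For example, tokenize("one\ntwo\n\nthree\n") yields LINE "one", LINE "two", SEP, LINE "three". Note that the SEP pattern also swallows any spaces or tabs that begin the line after the blank line, both here and in the quoted rule.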

Thanks much,

SteveT

Steve Litt 

Autumn 2023 featured book: Rapid Learning for the 21st Century
http://www.troubleshooters.com/rl21
