Re: Syntax error if paragraph contains more than 1 printable character

James K. Lowden Wed, 13 Dec 2023 15:43:18 -0800

On Tue, 12 Dec 2023 23:06:14 -0500
Steve Litt <sl...@troubleshooters.com> wrote:


> I've already split paratext into multiple LINE tokens which represent
> a line without its NL, and now I'm thinking of splitting line into
> multiple chars ("[^\n]"). Perhaps this will make the rules less
> complicated, though longer.

Have the scanner return two tokens only: 

        LINE  a line of text, no newline
        SEP   a blank line

The lexer might have:

.+/\n  { ... return LINE; }
(\n[[:blank:]]*){2,} { return SEP; } // two or more newlines, i.e. one or more blank lines
\n       { /* ignore */ }
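
To see what token stream those three rules would produce before wiring them into flex, here is a rough Python emulation. It is only a sketch: the names tokenize, LINE_RE, SEP_RE and NL_RE are mine, [[:blank:]] is assumed to mean space or tab, and the regexes approximate flex's matching rather than reproduce it exactly.

```python
import re

# Hypothetical emulation of the three scanner rules, not flex itself.
LINE_RE = re.compile(r"[^\n]+(?=\n)")     # .+/\n  (text followed by a newline)
SEP_RE = re.compile(r"(?:\n[ \t]*){2,}")  # (\n[[:blank:]]*){2,}
NL_RE = re.compile(r"\n")                 # \n, ignored


def tokenize(text):
    """Return a list of (kind, value) pairs: ("LINE", str) or ("SEP", None)."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = LINE_RE.match(text, pos)
        if m:
            tokens.append(("LINE", m.group()))
            pos = m.end()  # the newline itself is left for the next rule
            continue
        m = SEP_RE.match(text, pos)  # tried before NL_RE: longest match wins
        if m:
            tokens.append(("SEP", None))
            pos = m.end()
            continue
        m = NL_RE.match(text, pos)
        if m:
            pos = m.end()  # single newline between lines: no token
            continue
        # No rule matches (e.g. a last line with no newline after it).
        # flex's default rule would echo the character; here it is skipped.
        pos += 1
    return tokens
```

For example, tokenize("one\ntwo\n\nthree\n") gives LINE, LINE, SEP, LINE. Note that the SEP pattern also swallows any leading blanks on the line that follows the blank line, which may or may not matter for your input.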

Then your parser wants:

top:    paragraphs
        | paragraphs SEP // to allow for trailing blank lines
        ;
paragraphs: paragraph
        | paragraphs SEP paragraph
        ;
paragraph: lines
        ;
lines: LINE
        | lines LINE
        ;

I would think that would work.  

--jkl
