Currently the spec, as described in sweet.g, expects some form of
"preprocessor".

Perhaps we can actually concretely define such a preprocessor for the
core parser?

Here's my proposal:

(preprocessor
  port
  neoteric-read ; so we can use Scheme read while experimenting with it
  (lambda (get-token)
    (let ((token (get-token)))
      (do-whatever-with token))))

The token returned by (get-token) is one of the following forms:

INITIAL_INDENT_WITH_BANG
INITIAL_INDENT_NO_BANG
INDENT
DEDENT
BADDENT
SUBLIST
GROUP_SPLICE
RESTART_BEGIN
RESTART_END
EOF
hspace
comment_eol
scomment
(n-expr ,<datum>)

The logic of preprocessor's get-token function is this:

We keep track of a stack of indentations (indent-stack).  We also keep
track of whether we just recently consumed a newline (line-start), and
a count of pending dedents (pending-dedents).  Initially,
indent-stack is '(), line-start is #t, and pending-dedents is 0.

get-token promises to only use peek-char and read-char (i.e. one
character lookahead).

If pending-dedents is non-zero, decrement it and return 'DEDENT.

If peek-char returns an eof-object, check if indent-stack is '().  If
it is, return 'EOF.  Otherwise, count the number of items in the
indent-stack, set pending-dedents to the length minus 1, set
indent-stack to '() (so that a later call reaches 'EOF), and return
'DEDENT.
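
A minimal sketch of just the pending-dedents and EOF rules, in portable
Scheme.  The names make-eof-tokenizer and 'OTHER are hypothetical:
'OTHER is only a placeholder for every case not modelled here.  The
sketch empties indent-stack when it schedules the dedents so that a
later call can reach 'EOF.

```scheme
;; Sketch only: models the pending-dedents and EOF rules, nothing else.
;; make-eof-tokenizer is a hypothetical name; indent-stack is passed in
;; so the behaviour can be tried with a non-empty stack.
(define (make-eof-tokenizer initial-indent-stack)
  (let ((indent-stack initial-indent-stack)
        (pending-dedents 0))
    (lambda (port)
      (cond
        ;; Dedents left over from an earlier call are emitted first.
        ((> pending-dedents 0)
         (set! pending-dedents (- pending-dedents 1))
         'DEDENT)
        ((eof-object? (peek-char port))
         (if (null? indent-stack)
             'EOF
             (begin
               ;; One DEDENT is returned now, the rest are queued.
               (set! pending-dedents (- (length indent-stack) 1))
               (set! indent-stack '())
               'DEDENT)))
        ;; Placeholder: all other cases are out of scope for this sketch.
        (else 'OTHER)))))
```

With two open indentation levels and an exhausted port, three
successive calls give 'DEDENT, 'DEDENT and then 'EOF:

    (define get-token (make-eof-tokenizer (list "    " "  ")))
    (define port (open-input-string ""))
    (list (get-token port) (get-token port) (get-token port))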

When a ; or newline is found, consume until newline, set line-start to
#t, and return 'comment_eol.
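
A minimal sketch of the "consume until newline" step, using only
peek-char and read-char.  skip-to-eol is a hypothetical name.  The
sketch assumes the newline itself is consumed and that end of file
also ends the comment; setting line-start to #t is left to the caller.

```scheme
;; Sketch only: skip the rest of the current line and return 'comment_eol.
;; Assumes the terminating newline is consumed too; at end of file the
;; eof-object is left in the port for the next get-token call.
(define (skip-to-eol port)
  (let loop ()
    (let ((c (peek-char port)))
      (cond
        ((eof-object? c) 'comment_eol)
        ((char=? c #\newline)
         (read-char port)
         'comment_eol)
        (else
         (read-char port)
         (loop))))))
```

For example, given a port on "; note" followed by a newline and "foo",
skip-to-eol returns 'comment_eol and leaves the port at "foo".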

If at line-start, and (not (null? indent-stack)), clear line-start to
#f, consume indent characters (space, tab, !) and then:
- if first non-indent character is ";" or newline, consume until
newline, set line-start to #t, and return 'comment_eol.
- otherwise, update the indent-stack as needed:
- - If the current indent is incompatible with the top-most indent,
return 'BADDENT.
- - If the current indent is greater than the top-most indent, push it
on the indent-stack and return 'INDENT.
- - If the current indent is the same as the top-most indent, recurse
into (get-token) [or, if the BNF uses SAME, return 'SAME].
- - If the current indent is less than the top-most indent, pop
indent-stack items (counting the pops) until the stack top is equal to
or less than the current indent.  If the stack top is less, we got a
bad indent: return 'BADDENT.  If the stack top is equal, store the
number of pops minus 1 in pending-dedents and return 'DEDENT.  An
empty indent-stack is equivalent to "" for this handling.

(the expectation is that BADDENT will always be an error)
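
The indent-stack update can be sketched as a pure function.  This is
only an illustration under assumptions of mine: indents are
represented as strings, two indents are "compatible" when one is a
prefix of the other, and "greater" means "strictly extends".  The
names indent-prefix?, top-indent and classify-indent are hypothetical.
The function returns a list (token new-stack new-pending-dedents), and
'SAME stands in for "recurse into (get-token)".

```scheme
;; #t when string a is a prefix of string b.
(define (indent-prefix? a b)
  (and (<= (string-length a) (string-length b))
       (string=? a (substring b 0 (string-length a)))))

;; An empty indent-stack is treated as the indent "".
(define (top-indent stack)
  (if (null? stack) "" (car stack)))

;; Returns (token new-stack new-pending-dedents).
(define (classify-indent current stack)
  (let ((top (top-indent stack)))
    (cond
      ((string=? current top)
       (list 'SAME stack 0))
      ;; Deeper: the top-most indent is a proper prefix of the current one.
      ((indent-prefix? top current)
       (list 'INDENT (cons current stack) 0))
      ;; Shallower: pop until the stack top is equal to or less than current.
      ((indent-prefix? current top)
       (let loop ((s stack) (pops 0))
         (let ((t (top-indent s)))
           (cond
             ((string=? t current)
              (list 'DEDENT s (- pops 1)))
             ((and (pair? s)
                   (> (string-length t) (string-length current)))
              (loop (cdr s) (+ pops 1)))
             ;; Stack top is less than current: no matching level.
             (else (list 'BADDENT s 0))))))
      ;; Neither is a prefix of the other: incompatible.
      (else (list 'BADDENT stack 0)))))
```

For instance, (classify-indent "" (list "    " "  ")) pops two levels
and returns (DEDENT () 1): one 'DEDENT now and one left pending.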

If at line-start, and the first character is an indent character
(space, tab, !), clear line-start to #f and consume indent characters.
This is the "initial-indent" case - there is no indent-stack yet -
so return 'INITIAL_INDENT_WITH_BANG or 'INITIAL_INDENT_NO_BANG as
appropriate.

(the expectation is that INITIAL_INDENT_* will stop token processing,
i.e. get-token will not be called any more; in the
INITIAL_INDENT_NO_BANG case it's expected that the caller will use the
ordinary Scheme read on the port)
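
A minimal sketch of the initial-indent case, keeping to one-character
lookahead.  indent-char?, read-indent and initial-indent-token are
hypothetical names; the sketch assumes "with bang" simply means that
at least one ! appears among the consumed indent characters.

```scheme
;; #t for the indent characters named above: space, tab and !.
;; Also safe to call on an eof-object (returns #f).
(define (indent-char? c)
  (and (char? c)
       (or (char=? c #\space) (char=? c #\tab) (char=? c #\!))))

;; Consume leading indent characters using only peek-char and
;; read-char, and return them as a string.
(define (read-indent port)
  (let loop ((chars '()))
    (if (indent-char? (peek-char port))
        (loop (cons (read-char port) chars))
        (list->string (reverse chars)))))

;; Pick the initial-indent token for an already consumed indent string.
(define (initial-indent-token indent)
  (if (memv #\! (string->list indent))
      'INITIAL_INDENT_WITH_BANG
      'INITIAL_INDENT_NO_BANG))
```

Given a port on "  ! foo", read-indent returns "  ! " and leaves the
port at "foo"; initial-indent-token then gives
'INITIAL_INDENT_WITH_BANG for that string.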

If the character is a horizontal space, consume it and return 'hspace.

If the character is a "{" or "(" or "[", then return `(n-expr
,(neoteric-read port))

[TODO: #-handling.]

Otherwise, call neoteric-read.  If it returns $, return 'SUBLIST; if it
returns \\, return 'GROUP_SPLICE.  For <* and *>, we may need to have an
indent-stack-stack, and additional state for the extra tokens that
RESTART_END requires.  If it's not one of the special symbols, return
`(n-expr ,<datum>).
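
A minimal sketch of the mapping from a datum returned by neoteric-read
to a token.  datum->token is a hypothetical name.  Mapping <* and *>
straight to 'RESTART_BEGIN and 'RESTART_END is an assumption based on
the token list; the sketch ignores the indent-stack-stack and extra
state mentioned above.  The \\ symbol is built with string->symbol
because implementations differ on how a bare \\ is read.

```scheme
;; The two-backslash symbol, built explicitly for portability.
(define group-splice-symbol (string->symbol "\\\\"))

;; Sketch only: translate a datum from neoteric-read into a token.
(define (datum->token datum)
  (cond
    ((eq? datum '$) 'SUBLIST)
    ((eq? datum group-splice-symbol) 'GROUP_SPLICE)
    ;; Assumption: no extra state handling for restart markers here.
    ((eq? datum '<*) 'RESTART_BEGIN)
    ((eq? datum '*>) 'RESTART_END)
    (else (list 'n-expr datum))))
```

For example, (datum->token 'foo) returns (n-expr foo), while
(datum->token '$) returns SUBLIST.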

--

Assumptions:

1.  neoteric-read will not consume any whitespace or newlines after
it.  In particular, if neoteric-read is given "foo bar", it will
return 'foo and leave the port at " bar", including the space before
bar.

2.  BADDENT and INITIAL_INDENT_* will not cause get-token to get called again.

--

What do you think?

Sincerely,
AmkG

_______________________________________________
Readable-discuss mailing list
Readable-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/readable-discuss
