Nick Sabalausky wrote:
"Walter Bright" <newshou...@digitalmars.com> wrote in message
news:ia3c3r$14k...@digitalmars.com...
Does Goldie's lexer not convert numeric literals to integer values?
Are all tokens returned as strings?
Goldie's lexer (and parser) are based on the GOLD system (
http://www.devincook.com/goldparser/ ) which is deliberately independent of
both grammar and implementation language. As such, it doesn't know anything
about what the specific terminals actually represent. (There are 4 exceptions
though: Comment tokens, Whitespace tokens, an "Error" token (i.e., for lex
errors), and the EOF token.) So the lexed data is always represented as a
string.
Although, the lexer actually returns an array of "class Token" (
http://www.semitwist.com/goldiedocs/current/Docs/APIRef/Token/#Token ). To
get the original data that got lexed or parsed into that token, you call
"toString()". (BTW, there are currently different "modes" of "toString()"
for non-terminals, but I'm considering just ripping them all out and
replacing them with a single "return a slice from the start of the first
terminal to the end of the last terminal" - unless you think it would be
useful to get a representation of the non-terminal's original data sans
comments/whitespace, or with comments/whitespace converted to a single
space.)
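
For illustration, the "single slice" behaviour proposed above could look roughly like the sketch below. This is not Goldie's actual code: the `Terminal` struct, its `start`/`end` offsets, and the `nonTerminalText` function are all hypothetical names, and it assumes each terminal records its position in the original source text.

```d
// Hypothetical terminal record: offsets into the original source text.
// (Not Goldie's real Token class; just enough to show the slicing idea.)
struct Terminal
{
    size_t start; // index of the terminal's first character
    size_t end;   // index one past the terminal's last character
}

// Text of a non-terminal as one slice, from the start of its first terminal
// to the end of its last. Comments and whitespace in between are kept as-is.
string nonTerminalText(string source, const(Terminal)[] terminals)
{
    assert(terminals.length > 0);
    return source[terminals[0].start .. terminals[$ - 1].end];
}

void main()
{
    string source = "x = 1 /* one */ + 2;";
    // Terminals for "1", "+" and "2"; the comment is not a terminal here.
    auto terms = [Terminal(4, 5), Terminal(16, 17), Terminal(18, 19)];
    assert(nonTerminalText(source, terms) == "1 /* one */ + 2");
}
```

Because the result is a slice of the source, nothing is copied, but the comment comes along with it, which is the trade-off the question above is about.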
I'm not sure that calling "to!whatever(token.toString())" is really all that
much of a problem for user code.
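
As a minimal sketch of what that looks like in user code: the `Token` class below is a hypothetical stand-in that only mimics the `toString()` call mentioned above, not Goldie's real class. `to` is the standard `std.conv.to` template from Phobos.

```d
import std.conv : to;

// Hypothetical stand-in for Goldie's Token; only toString() comes from the
// discussion above, everything else is assumed for the example.
class Token
{
    private string text;

    this(string text) { this.text = text; }

    // Returns the original text that was lexed into this token.
    override string toString() { return text; }
}

void main()
{
    auto intTok = new Token("1234");
    int n = to!int(intTok.toString());
    assert(n == 1234);

    auto floatTok = new Token("2.5");
    double d = to!double(floatTok.toString());
    assert(d == 2.5);
}
```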
Consider a string literal, say "abc\"def". With Goldie's method, I infer this
string has to be scanned twice: once to find its limits, and a second time to
convert it to the actual string. The latter is user code and will have to
replicate whatever Goldie did.
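
To make the double scan concrete, here is a rough sketch. Both functions are hypothetical and do not reflect Goldie's implementation, and the escape handling is deliberately simplified: a backslash just makes the next character literal, which covers `\"` and `\\` only.

```d
import std.array : appender;

// Pass 1 (the lexer's job): find where the literal ends. It has to know
// about backslash escapes, or it would stop at the escaped quote.
size_t literalEnd(string src)
{
    size_t i = 1; // src[0] is the opening quote
    while (i < src.length)
    {
        if (src[i] == '\\')
            i += 2;           // skip the backslash and the character it escapes
        else if (src[i] == '"')
            return i + 1;     // index one past the closing quote
        else
            i++;
    }
    throw new Exception("unterminated string literal");
}

// Pass 2 (user code): walk the same characters again to build the value,
// repeating the escape logic that pass 1 already applied.
string decode(string lexeme)
{
    auto result = appender!string();
    auto inner = lexeme[1 .. $ - 1]; // strip the surrounding quotes
    for (size_t i = 0; i < inner.length; i++)
    {
        if (inner[i] == '\\' && i + 1 < inner.length)
            i++;              // drop the backslash, keep the escaped character
        result.put(inner[i]);
    }
    return result.data;
}

void main()
{
    string src = `"abc\"def" rest`;
    auto lexeme = src[0 .. literalEnd(src)];
    assert(lexeme == `"abc\"def"`);
    assert(decode(lexeme) == `abc"def`);
}
```

Both passes contain the same backslash check, which is the duplication being pointed out.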
If I may suggest, leave the low-level stuff out of the API until demand
for it justifies it. It's hard to predict just what will be useful, so I
suggest conservatism rather than kitchen sink. It can always be added
later, but it's really hard to remove.
That may be a good idea.
What Goldie will be compared against is Spirit. Spirit is a reasonably
successful add-on to C++. Goldie doesn't have to do things the same way as
Spirit (expression templates - ugh), but it should be as easy to use and at
least as powerful.
That too, but I meant a clutter of files. Long files aren't a problem with
D.
Well, again, it may not be a problem with DMD, but I really think
reading/editing a long file is a pain regardless of language. Maybe we just
have different ideas of "long file"? To put it into numbers: At the moment,
Goldie's library (not counting tools and the optional generated
"static-mode" files) is about 3200 lines, including comment/blank lines.
That size would be pretty unwieldy to maintain as a single source file,
particularly since Goldie has a natural internal organization.
Actually, I think 3200 lines is of moderate, not large, size :-)
Personally, I'd much rather have a clutter of source files than a cluttered
source file. (But of course, I don't go to Java extremes and put *every*
tiny little thing in a separate file.) As long as the complexity of having
multiple files isn't passed along to user code (hence the frequent "module
foo.all" idiom), then I can't say I really see a problem.
I tend to just not like having to constantly grep to see which file XXX is in.