According to Geoff Hutchison:
> On Tue, 11 Apr 2000, Gilles Detillieux wrote:
> > so I didn't realise that tokens were context-dependent. In my University
> > courses, when we learned about lexical analysers, we learned that they
> > commonly stick to a strict type 3 (regular) grammar, which means they have no
>
> OK, so let's make up one of those nice automata they always drew in
> classes. We'll start out and go to a set of states for accepting keywords,
> which stops when we hit a ':' character (the delimiter). Now we go into
> some other set of states. To the untrained eye, the new states may look a
> *lot* like the old states since both of them accept strings. But we have a
> "memory" that these are attribute states and the others were
> keyword states.
Except that the way I learned it in my classes, the lexical analyser
changes states based on its input, which is individual characters, until
it reaches a final state, at which point it passes the string it built to
the next level (as a token for the parser). The next time it starts up, it
is back at its initial state, with no memory of previous lexical analysis
or of the parser's last token. One of the points that was stressed was
that with a type 3 (regular) grammar, you can't do anything that involves
memory of tokens, such as checking for proper nesting of brackets. That
requires at least a type 2 (context-free) grammar, which was up to the
parser, not the lexical analyser. It seems that Lex/Flex, with its start
conditions, implements more than a plain type 3 grammar, though. That's
what threw me off track before.
> > could trigger a syntax error. Mind you, I haven't heard any complaints
> > about it rejecting next_page_text and prev_page_text, so I must be
> > missing something here. Can anyone shed some light on this?
>
> I *think* it's again the business of strings. Once you've hit a delimiter,
> essentially everything is an attribute. But I think Vadim may be the only
> person who can answer this.
If I recall, he's no longer on the list either. Anyone know how to reach
him at his new place of work?
> > It just seems that all these "special cases" are making the code quite
> > convoluted, not to mention repetitive. Do we really need 6 sections that
> > do essentially the same concatenation, and do we need to add more if we
> > find that other token types need to be added? What I had in mind was
> > something more like this:
> > [snip]
> > If we find that we need to add other token types to lists, we just
> > need to add one entry to the "item" definition. Am I oversimplifying,
> > or introducing an ambiguity in the grammar by taking this approach?
> > I guess we'd need to add a "%type <str> item" definition above that,
> > but all other definitions would be as-is. Am I missing something?
>
> No, I'd agree here. But I hate messing around with something that I don't
> understand. So my policy was "if it works, then we can use it for 3.2.0b2
> and get it out the door." Then we can do cleanups later. It's simply a
> matter of practicality--IMHO if there's a showstopper, you fix it as best
> you can without major changes.
>
> For example, I didn't see why there was a distinction between T_STRING
> and T_NUMBER since we do the same thing with them anyway. And yes, I
> thought of almost exactly the same code as you--it seems less convoluted.
> But at the moment, we don't have regression tests for the config
> parser, so it's hard to know if we break something unintentionally.
Yes, that's a very good point. Let's release it the way it now stands,
and see if other problems crop up. If they do, or if someone has a chance
to validate a simpler grammar, then we can change it in the next release.
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930