On Tue, 11 Apr 2000, Gilles Detillieux wrote:

> to just the token type.  I have to admit I have fairly minimal experience
> with YACC/Bison, and a lot less with Lex, so to some extent I'm fumbling
> along here.  I hadn't looked very closely at the Lex code before today,

Well, we're probably about equal in experience. ;-)

> so I didn't realise that tokens were context-dependent.  In my University
> courses, when we learned about lexical analysers, we learned that they
> commonly stick to a strict type 1 grammar, which means they have no

OK, so let's make up one of those nice automata they always drew in
classes. We'll start out and go to a set of states for accepting keywords,
which stops when we hit a ':' character (the delimiter). Now we go into
some other set of states. To the untrained eye, the new states may look a
*lot* like the old states since both of them accept strings. But we have a
"memory" that these are attribute-string states and the others were
keyword states.
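
To make the "memory" idea concrete, here is a minimal sketch in Python
rather than Lex. It is not the actual htdig scanner: the function name
scan_line and the token names T_KEYWORD and T_DELIMITER are made up for
illustration (only T_STRING appears in this thread). The point is just that
the same characters come out as different tokens depending on whether we
have passed the ':' yet.

```python
# Hypothetical two-state scanner; NOT the real htdig Lex code.
KEYWORD, ATTRIBUTE = "KEYWORD", "ATTRIBUTE"


def scan_line(line):
    """Split 'name: value ...' into (token_type, text) pairs."""
    tokens = []
    state = KEYWORD  # the "memory": which set of states we are in
    word = ""
    for ch in line:
        if state == KEYWORD and ch == ":":
            # Delimiter seen: emit the keyword and switch state sets.
            tokens.append(("T_KEYWORD", word.strip()))
            tokens.append(("T_DELIMITER", ":"))
            word = ""
            state = ATTRIBUTE
        elif state == ATTRIBUTE and ch.isspace():
            # Whitespace ends an attribute string.
            if word:
                tokens.append(("T_STRING", word))
                word = ""
        else:
            word += ch
    if word.strip():
        # Flush whatever is left, typed by the state we ended in.
        kind = "T_KEYWORD" if state == KEYWORD else "T_STRING"
        tokens.append((kind, word.strip()))
    return tokens
```

With this, scan_line("next_page_text: prev_page_text") gives a T_KEYWORD
for the first word but a plain T_STRING for the second, which is roughly
why a keyword-looking word after the delimiter would not be a syntax error.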

> could trigger a syntax error.  Mind you, I haven't heard any complaints
> about it rejecting next_page_text and prev_page_text, so I must be
> missing something here.  Can anyone shed some light on this?

I *think* it's again the business of strings. Once you've hit a delimiter,
essentially everything is an attribute. But I think Vadim may be the only
person who can answer this.

> It just seems that all these "special cases" are making the code quite
> convoluted, not to mention repetitive.  Do we really need 6 sections that
> do essentially the same concatenation, and do we need to add more if we
> find that other token types need to be added?  What I had in mind was
> something more like this:
> [snip]
> If we find that we need to add other token types to lists, we just
> need to add one entry to the "item" definition.  Am I oversimplifying,
> or introducing an ambiguity in the grammar by taking this approach?
> I guess we'd need to add a "%type <str> item" definition above that,
> but all other definitions would be as-is.  Am I missing something?

No, I'd agree here. But I hate messing around with something that I don't
understand. So my policy was "if it works, then we can use it for 3.2.0b2
and get it out the door." Then we can do cleanups later. It's simply a
matter of practicality--IMHO if there's a showstopper, you fix it as best
as you can without major changes.

For example, I didn't see why there was a distinction between T_STRING
and T_NUMBER since we do the same thing with them anyway. And yes, I
thought of almost exactly the same code as you--it seems less convoluted.
But at the moment, we don't have regression tests for the config
parser, so it's hard to know if we break something unintentionally.
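
As a rough sketch of both ideas (one "item" rule instead of six sections,
plus a tiny regression table), something like the following Python would
do. None of this is the real parser: join_list, ITEM_TYPES,
REGRESSION_CASES and the T_KEYWORD token are hypothetical names, and the
expected strings are invented examples, not htdig's actual behaviour.

```python
# Hypothetical set of token types allowed in a list. Only T_STRING and
# T_NUMBER come from this thread; T_KEYWORD is an assumed extra.
ITEM_TYPES = {"T_STRING", "T_NUMBER", "T_KEYWORD"}


def join_list(tokens):
    """Concatenate list items with one rule, whatever their token type."""
    parts = []
    for kind, text in tokens:
        if kind not in ITEM_TYPES:
            raise SyntaxError(f"unexpected {kind} in list: {text!r}")
        parts.append(text)
    return " ".join(parts)


# Table-driven regression cases: (input tokens, expected joined value).
REGRESSION_CASES = [
    ([("T_STRING", "foo"), ("T_NUMBER", "10")], "foo 10"),
    ([("T_KEYWORD", "prev_page_text"), ("T_STRING", "bar")],
     "prev_page_text bar"),
]


def run_regression():
    """Return one pass/fail flag per case."""
    return [join_list(toks) == want for toks, want in REGRESSION_CASES]
```

Adding a new token type to lists then means adding one entry to
ITEM_TYPES, and any cleanup can be checked against the table before it
goes in.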

-Geoff

