Ways to tweak the TAP structure before handing it to the parser?
Stripping leading hyphens, as in the previous replies; sanitizing
sensitive information (maybe non-employees shouldn't see IP addresses
when they view the parsed result); who knows.
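
A rough sketch of what after_lexing hooks could look like, written in
Python rather than Perl purely for illustration. Everything here is
hypothetical and not the module's actual API: the token layout (a dict
with "type" and "description" keys), the toy lex() function, and the
names strip_leading_hyphen, mask_ip_addresses and run_hooks.

```python
import re

# Python spelling of the s{ \A \s* - \s+ }{} substitution from this thread.
LEADING_HYPHEN = re.compile(r"\A\s*-\s+")
# Deliberately loose IPv4 pattern; good enough for a sketch.
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
TEST_LINE = re.compile(r"\A(not )?ok\b\s*(\d+)?\s*(.*)\Z")


def lex(lines):
    """Toy lexer: test lines become 'test' tokens, the rest 'unknown'."""
    tokens = []
    for line in lines:
        m = TEST_LINE.match(line)
        if m:
            tokens.append({
                "type": "test",
                "ok": m.group(1) is None,
                "number": int(m.group(2)) if m.group(2) else None,
                "description": m.group(3),
            })
        else:
            tokens.append({"type": "unknown", "raw": line})
    return tokens


def strip_leading_hyphen(token):
    """Hook: drop the conventional '- ' prefix from test descriptions."""
    if token["type"] == "test":
        token["description"] = LEADING_HYPHEN.sub("", token["description"])
    return token


def mask_ip_addresses(token):
    """Hook: hide anything that looks like an IPv4 address."""
    if token["type"] == "test":
        token["description"] = IPV4.sub("[IP REDACTED]", token["description"])
    return token


def run_hooks(tokens, hooks):
    """Apply each hook to a copy of every token, in order."""
    for hook in hooks:
        tokens = [hook(dict(t)) for t in tokens]
    return tokens


tap = ["1..2", "ok 1 - connects to 10.0.0.5", "not ok 2 - logs out"]
tokens = run_hooks(lex(tap), [strip_leading_hyphen, mask_ip_addresses])
```

The hooks only rewrite descriptions and leave token types alone, so
the parser still sees the same structure it would have seen without
them.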

My opinion is that playing with the tokenizer output shouldn't be
discouraged. If someone breaks the structure, however, that's their
problem. You're the one with vetitive power in the end, though :-)

On 7/9/06, Ovid <[EMAIL PROTECTED]> wrote:
----- Original Message ----
From: Ian Langworth <[EMAIL PROTECTED]>

> This is cool, Ovid! I think you're definitely on the right track.

Thanks!

> Thoughts:
>
> - I'd like an option to automatically s{ \A \s* - \s+ }{} all test
> descriptions. I bet a lot of people would end up doing this
> themselves, including myself.

Hmm, I can probably add that in fairly easily.

> - Speaking of that step, the underscore in your tokenizing method
> (_lex) denotes it as private. What if I wanted to massage the tokens
> after that phase? Specifying after_lexing hooks might be useful.

Can you show me some example syntax of what you'd want to do?  The problem is, 
the lexing and parsing are very tightly coupled.  Allowing people to massage 
the lexer output directly is asking for trouble.

> - I'm not sure how closely you're trying to follow a traditional,
> multi-phase parser. If the lexer's job is to simply turn TAP output
> into data structures, why does it die on the semantic error of a
> forgotten plan? (10-lex.t line 183) Sounds like that's the job of the
> parser.

Originally I had a clearer distinction between lexing and parsing, but there 
were three reasons I abandoned this:

1.  TAP is line based and allows non-parseable junk.
2.  Context is very important and much of that is easy to track while lexing.
3.  Due to some lexing ambiguities raised by precedence issues, I found that 
the lexer would need much finer-grained tokens than I was creating and that 
made the grammar more complicated.

However, you are correct that the lexer should not be dying.  In fact, neither 
should the parser.  I decided last night that all errors should be recorded, as 
you were wanting.
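
To illustrate the "record errors instead of dying" idea, here is a
minimal sketch in Python rather than Perl. It is not the actual
implementation: the parse() function, its return shape and the error
messages are all assumptions made up for the example.

```python
import re

PLAN = re.compile(r"\A1\.\.(\d+)\Z")
TEST = re.compile(r"\A(not )?ok\b")


def parse(lines):
    """Return (results, errors); never raises on malformed TAP."""
    results, errors = [], []
    planned = None
    for lineno, line in enumerate(lines, start=1):
        plan = PLAN.match(line)
        if plan:
            if planned is not None:
                errors.append(f"line {lineno}: more than one plan")
            else:
                planned = int(plan.group(1))
        elif TEST.match(line):
            results.append(not line.startswith("not "))
        # Anything else is treated as non-parseable junk and skipped.
    if planned is None:
        errors.append("no plan found")
    elif planned != len(results):
        errors.append(f"planned {planned} tests but saw {len(results)}")
    return results, errors


results, errors = parse(["ok 1", "not ok 2", "some junk"])
```

A forgotten plan then shows up as an entry in errors while the test
results gathered so far remain available to the caller.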

> Your tests are very descriptive & well organized.

Really?  I thought I was a bit sloppy with 'em :)

Cheers,
Ovid

--
Ian Langworth
