On 05/14/2012 05:00 PM, Roman D. Boiko wrote:
> On Saturday, 12 May 2012 at 03:32:20 UTC, Ary Manzana wrote:
>> I think you are wasting much more memory and performance by storing
>> all the tokens in the lexer.
>>
>> Imagine I want to implement a simple syntax highlighter: just
>> highlight keywords. How can I tell DCT to *not* store all tokens
>> because I need each one in turn? And since I'll be highlighting in the
>> editor I will need column and line information. That means I'll have
>> to do that O(log(n)) operation for every token.
>>
>> So you see, for the simplest use case of a lexer the performance of
>> DCT is awful.
>>
>> Now imagine I want to build an AST. Again, I consume the tokens one by
>> one, probably peeking in some cases. If I want to store line and
>> column information I just copy them to the AST. You say the tokens are
>> discarded but their data is not, and that's why their data is usually
>> copied.
>
> Currently I am thinking about making Token a class instead of a struct.
>
> ...
>
> Could anybody suggest other pros and cons? Which option would you choose?

Just use a circular buffer of value-type tokens. There is absolutely no excuse for slow parsing.
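
As a rough illustration of the circular-buffer idea, here is a minimal sketch in C++ (the same layout would map onto a D struct, which is also a value type). Everything in it is hypothetical and not taken from DCT: the Token fields, ToyLexer, TokenStream, and the toy rule that only "if" and "return" are keywords. The lexer records line and column as it scans, so no per-token lookup is needed afterwards, and the stream keeps only a small fixed window of tokens for lookahead.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical token layout: a small value type, copied in and out of the buffer.
enum class TokenKind : std::uint8_t { Identifier, Keyword, Number, EndOfInput };

struct Token {
    TokenKind kind = TokenKind::EndOfInput;
    std::uint32_t offset = 0;  // index of the first character in the source
    std::uint32_t length = 0;
    std::uint32_t line = 1;    // 1-based, recorded while scanning
    std::uint32_t column = 1;  // 1-based, recorded while scanning
};

// Toy lexer (assumption): tokens are runs of characters separated by
// spaces or newlines. Line and column are updated incrementally.
class ToyLexer {
public:
    explicit ToyLexer(std::string_view src) : src_(src) {}

    Token next() {
        while (pos_ < src_.size() && (src_[pos_] == ' ' || src_[pos_] == '\n')) {
            if (src_[pos_] == '\n') { ++line_; column_ = 1; } else { ++column_; }
            ++pos_;
        }
        Token t;
        t.offset = static_cast<std::uint32_t>(pos_);
        t.line = line_;
        t.column = column_;
        if (pos_ == src_.size()) return t;  // kind stays EndOfInput
        const std::size_t start = pos_;
        while (pos_ < src_.size() && src_[pos_] != ' ' && src_[pos_] != '\n') {
            ++pos_;
            ++column_;
        }
        t.length = static_cast<std::uint32_t>(pos_ - start);
        t.kind = classify(src_.substr(start, pos_ - start));
        return t;
    }

private:
    static TokenKind classify(std::string_view text) {
        if (text == "if" || text == "return") return TokenKind::Keyword;
        if (text[0] >= '0' && text[0] <= '9') return TokenKind::Number;
        return TokenKind::Identifier;
    }

    std::string_view src_;
    std::size_t pos_ = 0;
    std::uint32_t line_ = 1;
    std::uint32_t column_ = 1;
};

// Fixed-size circular buffer of tokens with up to N tokens of lookahead.
// No per-token allocation: slots are overwritten as tokens are consumed.
template <std::size_t N>
class TokenStream {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    explicit TokenStream(ToyLexer lexer) : lexer_(lexer) {}

    // Look k tokens ahead without consuming; k = 0 is the current token.
    const Token& peek(std::size_t k = 0) {
        assert(k < N);
        while (count_ <= k) {
            buf_[(head_ + count_) & (N - 1)] = lexer_.next();
            ++count_;
        }
        return buf_[(head_ + k) & (N - 1)];
    }

    // Consume the current token and return it by value.
    Token pop() {
        const Token t = peek(0);
        head_ = (head_ + 1) & (N - 1);
        --count_;
        return t;
    }

private:
    ToyLexer lexer_;
    std::array<Token, N> buf_{};
    std::size_t head_ = 0;   // slot of the current token
    std::size_t count_ = 0;  // number of buffered tokens
};
```

A highlighter would call pop() in a loop and never look ahead. A parser would construct something like TokenStream<4>, use peek(1) or peek(2) where it needs lookahead, and copy line and column from the popped Token into its AST nodes. Memory use stays constant regardless of the size of the source.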
