On 5/12/12 12:17 PM, Roman D. Boiko wrote:
On Saturday, 12 May 2012 at 03:32:20 UTC, Ary Manzana wrote:
As deadalnix says, I think you are over-complicating things.

I mean, to store the column and line information it's just:

    if (isNewLine(c)) {
        line++;
        column = 0;
    } else {
        column++;
    }

(I think you need to add that to the SourceRange class, then copy line
and column to the token in the Lexer#lex() method.)
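To make the snippet above concrete, here is a minimal, self-contained sketch in C++. The names Position, advance and positionAfter are hypothetical and are not part of DCT. It also assumes that only '\n' ends a line, which a real lexer would need to extend.

```cpp
#include <cstddef>
#include <string>

// Hypothetical position tracker; illustrative only, not DCT's SourceRange.
struct Position {
    std::size_t line = 1;    // 1-based line number
    std::size_t column = 0;  // characters consumed on the current line
};

// Assumption: only '\n' counts as a line break.
inline bool isNewLine(char c) { return c == '\n'; }

// The same update as in the snippet above, applied once per consumed character.
inline void advance(Position& pos, char c) {
    if (isNewLine(c)) {
        pos.line++;
        pos.column = 0;
    } else {
        pos.column++;
    }
}

// Convenience helper: the position reached after consuming all of `text`.
inline Position positionAfter(const std::string& text) {
    Position pos;
    for (char c : text) {
        advance(pos, c);
    }
    return pos;
}
```

The per-character cost is one comparison and one increment, so the lexer can stamp each token with the current line and column as it is produced.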

Do you really think it's that costly in terms of performance?

I think you are wasting much more memory and performance by storing
all the tokens in the lexer.

Imagine I want to implement a simple syntax highlighter: just
highlight keywords. How can I tell DCT to *not* store all tokens
because I need each one in turn? And since I'll be highlighting in the
editor I will need column and line information. That means I'll have
to do that O(log(n)) operation for every token.
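For comparison, here is a sketch of what such an O(log(n)) lookup might look like: a binary search over a table of line-start offsets. This is an assumption about the general technique, not a description of DCT's actual code, and the names buildLineStarts and locate are hypothetical. Lines are 1-based, columns are 0-based, and only '\n' is treated as a line break.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct LineColumn {
    std::size_t line;    // 1-based
    std::size_t column;  // 0-based offset within the line
};

// Records the offset of the first character of every line.
// Assumption: only '\n' ends a line.
std::vector<std::size_t> buildLineStarts(const std::string& text) {
    std::vector<std::size_t> starts;
    starts.push_back(0);
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\n') {
            starts.push_back(i + 1);
        }
    }
    return starts;
}

// Binary search for the last line start that is <= offset.
// Cost is O(log n) in the number of lines, paid on every lookup.
LineColumn locate(const std::vector<std::size_t>& starts, std::size_t offset) {
    auto it = std::upper_bound(starts.begin(), starts.end(), offset);
    std::size_t index = static_cast<std::size_t>(it - starts.begin()) - 1;
    return {index + 1, offset - starts[index]};
}
```

A highlighter that visits every token would call locate once per token, whereas the incremental counter does constant work per character with no table at all.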

So you see, for the simplest use case of a lexer the performance of
DCT is awful.

Now imagine I want to build an AST. Again, I consume the tokens one by
one, probably peeking in some cases. If I want to store line and
column information I just copy them to the AST. You say the tokens are
discarded but their data is not, and that's why their data is usually
copied.
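A rough sketch of that consumption pattern: a lexer that hands out tokens one at a time, supports a single-token peek, and stamps each token with its line and column so a parser can copy them into AST nodes. Everything here (Token, StreamingLexer, whitespace-separated tokens, '\n' as the only line break) is a simplifying assumption for illustration and not DCT's design.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

struct Token {
    std::string text;
    std::size_t line = 1;    // 1-based line of the token's first character
    std::size_t column = 0;  // 0-based column of the token's first character
};

// Hypothetical streaming lexer: keeps at most one token of lookahead.
class StreamingLexer {
public:
    explicit StreamingLexer(std::string source) : src_(std::move(source)) {}

    // Copies the next token into `out` without consuming it; false at end of input.
    bool peek(Token& out) {
        if (!buffered_) {
            hasToken_ = scan(lookahead_);
            buffered_ = true;
        }
        if (hasToken_) out = lookahead_;
        return hasToken_;
    }

    // Copies the next token into `out` and consumes it; false at end of input.
    bool next(Token& out) {
        if (buffered_) {
            buffered_ = false;
            if (hasToken_) out = lookahead_;
            return hasToken_;
        }
        return scan(out);
    }

private:
    static bool isSpace(char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    }

    // Consumes one character and updates line/column incrementally.
    void advance() {
        if (src_[pos_] == '\n') {
            ++line_;
            column_ = 0;
        } else {
            ++column_;
        }
        ++pos_;
    }

    // Scans one whitespace-delimited token starting at the current position.
    bool scan(Token& out) {
        while (pos_ < src_.size() && isSpace(src_[pos_])) advance();
        if (pos_ >= src_.size()) return false;
        out.text.clear();
        out.line = line_;
        out.column = column_;
        while (pos_ < src_.size() && !isSpace(src_[pos_])) {
            out.text += src_[pos_];
            advance();
        }
        return true;
    }

    std::string src_;
    std::size_t pos_ = 0;
    std::size_t line_ = 1;
    std::size_t column_ = 0;
    Token lookahead_;
    bool buffered_ = false;
    bool hasToken_ = false;
};
```

Only the lookahead token is ever held in memory; anything the parser wants to keep, such as line and column, it copies out of the Token before asking for the next one.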

Would it be possible for you to fork my code and tweak it for
comparison? You will definitely discover more problems this way, and
such feedback would really help me. That doesn't seem like an
unreasonable amount of work.

Any other volunteers for this?

Sure, I'll do it and provide some benchmarks. Thanks for creating the issue.
