On 03/08/2012 21:59, Walter Bright wrote:
On 8/3/2012 6:18 AM, deadalnix wrote:
The lexer can have a parameter that tells it whether to build a table of
tokens or to slice the input. The second is important, for instance for an
IDE: lexing will occur often, and you prefer slicing here because you
already have the source file in memory anyway.
A string may span multiple lines - IDEs do not store the text as one string.
If the lexer allocates chunks, it will reuse the same memory location for
the same string. Considering the following mechanism to compare slices,
this will require 2 comparisons for identifiers lexed with that method:
if(a.length != b.length) return false;
if(a.ptr == b.ptr) return true;
// Regular char by char comparison.
Is that a suitable option?
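For reference, the three-step comparison above could be written out roughly like this. It is only a sketch: `sameText` is a made-up name, and it assumes identifiers are handed around as plain `const(char)[]` slices.

```d
// Hypothetical helper (not from any existing lexer): compares two slices
// using the length check, then the pointer check, then the slow path.
bool sameText(const(char)[] a, const(char)[] b)
{
    if (a.length != b.length) return false; // different lengths: cannot match
    if (a.ptr == b.ptr) return true;        // same start, same length: same memory
    return a == b;                          // regular char by char comparison
}

unittest
{
    string src = "foo bar foo";
    auto first = src[0 .. 3];
    assert(sameText(first, src[0 .. 3]));   // taken by the pointer fast path
    assert(sameText(first, src[8 .. 11]));  // equal text, different location
    assert(!sameText(first, src[4 .. 7]));  // "bar": same length, other chars
    assert(!sameText(first, src[0 .. 2]));  // rejected by the length check
}
```

With `rdmd -unittest --main` the `unittest` block runs as a quick check. When both slices come from the same allocation scheme, equal identifiers stop at the second test; the third line is only reached for slices that were produced some other way.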
You're talking about doing for strings what is done for identifiers -
returning a unique handle for each. I don't think this works very well
for string literals, as there seem to be few duplicates.
That option has the benefit of allowing very fast identifier comparison
(like DMD does) without imposing it. For instance, you could use that
trick in a single thread, and use another identifier table in another
thread.
It completely avoids the multithreading problem you mention, while keeping
most identifier comparisons really fast.
It also allows several allocation schemes for the slices, to fit different
needs, as shown by Christophe Travert.
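To make the "one table per thread" idea concrete, here is a minimal sketch. The names `identifierTable` and `intern` are invented for illustration, and copying each new identifier with `idup` is just one possible allocation scheme among those discussed. It relies on the fact that module-level variables in D are thread-local by default.

```d
// Hypothetical per-thread identifier table. Module-level variables are
// thread-local by default in D, so each thread gets its own table and
// no locking is needed.
private string[string] identifierTable;

// Returns the canonical slice for `text`, allocating only on first sight.
string intern(const(char)[] text)
{
    if (auto p = text in identifierTable)
        return *p;                  // already seen: reuse the stored slice
    string copy = text.idup;        // first occurrence: allocate once
    identifierTable[copy] = copy;
    return copy;
}

unittest
{
    char[] buffer = "count = count + 1".dup;  // stands in for an IDE buffer
    auto a = intern(buffer[0 .. 5]);
    auto b = intern(buffer[8 .. 13]);
    assert(a == "count");
    assert(a.ptr is b.ptr);                   // same memory for the same text
    assert(intern("other").ptr !is a.ptr);    // different identifier, new slot
}
```

Within one thread, two interned identifiers with the same text share the same pointer, so comparing them needs only the length and pointer checks. Slices coming from another thread's table still compare correctly, they just take the char by char path.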