On Thursday, August 02, 2012 11:08:23 Walter Bright wrote:
> The tokens are not kept, correct. But the identifier strings, and the string
> literals, are kept, and if they are slices into the input buffer, then
> everything I said applies.

String literals often _can't_ be slices unless you leave them in their original state rather than giving the version that they translate to (e.g. leaving \&copy; in the string rather than replacing it with its actual Unicode value). And since you're not going to be able to create the translated literal using whatever type the range is unless it's a string of some variety, the literals often can't be slices, which - depending on the implementation - would make it so that they can't _ever_ be slices.

Identifiers are a different story, since they don't have to be translated at all. But regardless of whether keeping a slice would be better than creating a new string, an identifier table will be far superior, since then you only need one copy of each identifier.

So, it ultimately doesn't make sense to use slices in either case, even without considering issues like them being spread across memory. The only place that I'd expect a slice in a token is in the string which represents the text which was lexed, and that won't normally be kept around.

- Jonathan M Davis
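
To make the two points above concrete, here is a minimal sketch in Python (not D, and not taken from any actual lexer). The names `ESCAPES`, `ENTITIES`, `decode_literal_body`, and `IdentifierTable` are made up for illustration, the escape handling covers only a tiny subset of what a real lexer would need, and it assumes well-formed input. It shows that a translated literal is no longer a piece of the source text, and that an identifier table hands back one shared copy per distinct identifier.

```python
import re

# Hypothetical, deliberately tiny escape tables; a real lexer supports far more.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}
ENTITIES = {"copy": "\u00a9", "amp": "&"}


def decode_literal_body(raw: str) -> str:
    """Translate the escapes in a string literal's body (quotes excluded)."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch != "\\":
            out.append(ch)
            i += 1
        elif raw[i + 1] == "&":
            # Named entity such as \&copy; : look up the name before the ';'.
            semi = raw.index(";", i + 2)
            out.append(ENTITIES[raw[i + 2:semi]])
            i = semi + 1
        else:
            # Simple one-character escape such as \n.
            out.append(ESCAPES[raw[i + 1]])
            i += 2
    return "".join(out)


class IdentifierTable:
    """Keeps exactly one string object per distinct identifier."""

    def __init__(self) -> None:
        self._ids: dict[str, str] = {}

    def intern(self, text: str) -> str:
        # Returns the stored copy if one exists, otherwise stores this one.
        return self._ids.setdefault(text, text)

    def __len__(self) -> int:
        return len(self._ids)


if __name__ == "__main__":
    source = r'notice = "\&copy; 2012"; total = total + count'

    # The translated literal contains a character that never appears in the
    # source, so it cannot be expressed as a slice of the input.
    body = source[source.index('"') + 1:source.rindex('"')]
    decoded = decode_literal_body(body)
    print(decoded in source)

    # Every occurrence of an identifier maps to the same stored string.
    table = IdentifierTable()
    tail = source[source.rindex('"') + 1:]
    names = [table.intern(m.group()) for m in re.finditer(r"[A-Za-z_]\w*", tail)]
    print(names, len(table))
```

Tokens would then carry the decoded literal or the interned identifier, and the input buffer itself would not need to stay alive for their sake.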