David Kastrup <[email protected]> skribis:

> [email protected] (Ludovic Courtès) writes:
>
>> David Kastrup <[email protected]> skribis:
>>
>>> I'm currently migrating LilyPond over to GUILE 2.0. LilyPond has its
>>> own UTF-8 verification, error flagging, processing and indexing.
>>
>> Do I understand correctly that LilyPond expects Guile strings to be byte
>> vectors, which it can feed with UTF-8 byte sequences that it built by
>> itself?
>
> Not really. LilyPond reads and parses its own files but it does divert
> parts through GUILE occasionally in the process. Some stuff is passed
> through GUILE with time delays and parts wrapped into closures and
> flagged with machine-identifiable source locations.

OK.

>>> If you take a look at
>>> <URL:http://git.savannah.gnu.org/cgit/lilypond.git/tree/scm/parser-ly-from-scheme.scm>,
>>> ftell on a string port is here used for correlating the positions of
>>> parsed subexpressions with the original data. Reencoding strings in
>>> utf-8 is not going to make this work with string indexing since ftell
>>> does not bear a useful relation to string positions.
>>
>> AIUI the result of ‘ftell’ is used in only one place, while
>> ‘port-line’ and ‘port-column’ are used in other places.
>
> The ftell information is wrapped into an alist together with a closure
> corresponding to the source location. At a later point of time, the
> surrounding string may be interpreted, and the source location is
> correlated with the closure and the closure used instead of a call to
> local-eval (which does not have the same power of evaluating materials
> in a preserved lexical environment as a closure has).
>
>> The latter seems more appropriate to me when it comes to tracking
>> source location.
>
> For error messages, yes. For associating a position in a string with a
> previously parsed closure, no.

But wouldn’t a line/column pair be as suitable as a unique identifier as
the position in the file?

Also, if the result of ‘ftell’ is used as a unique identifier, does it
really matter whether it’s an offset measured in bytes or in characters?

(Trying to make sure I understand the problem.)

Thanks,
Ludo’.
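
On the bytes-versus-characters question, a minimal sketch may help. It is
written in Python rather than Guile, and the sample string and the helper
`offsets` are made up for illustration; nothing here is taken from LilyPond
or from Guile's string-port code. It shows that a UTF-8 byte offset and a
character index agree only up to the first non-ASCII character, so a byte
offset cannot be used directly as a string index, even though either measure
identifies a position uniquely when it is used consistently.

```python
# Illustration only: Python stands in for Guile; this is not LilyPond code.

# Hypothetical sample input containing two non-ASCII characters.
text = "é = 1; ü = 2"


def offsets(s, needle):
    """Return (character index, UTF-8 byte offset) of the first `needle` in `s`."""
    char_index = s.index(needle)
    # Everything before the match, re-encoded as UTF-8, gives the byte offset.
    byte_offset = len(s[:char_index].encode("utf-8"))
    return char_index, byte_offset


char_pos, byte_pos = offsets(text, "ü")

# The two measures differ because "é" occupies two bytes in UTF-8.
print(char_pos, byte_pos)

# Indexing the string with the character index finds the right character;
# indexing it with the byte offset lands one character too far.
print(repr(text[char_pos]), repr(text[byte_pos]))
```

Either number would serve as a lookup key for a previously parsed
subexpression, provided the key is produced and consumed in the same unit.
The mismatch only bites when a byte offset is later treated as a string
position.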
