Antonio Ospite <a...@ao2.it> writes:

> ---------------------------------------------------------------------
> (process:18706): Pango-WARNING **: Invalid UTF-8 string passed to pango_layout_set_text()
> ---------------------------------------------------------------------
>
> and in the final files only part of the "büüh" string was rendered,
> although the "ü" was rendered correctly.
>
> So I added a printout to see what was going on:
>
> ---------------------------------------------------------------------
> diff --git a/lily/lily-guile.cc b/lily/lily-guile.cc
> index 2c519ec..9c0c10c 100644
> --- a/lily/lily-guile.cc
> +++ b/lily/lily-guile.cc
> @@ -132,6 +132,7 @@ ly_scm2string (SCM str)
>        result.resize (len);
>        scm_to_locale_stringbuf (str, &result.at (0), len);
>      }
> +  fprintf(stderr, "%s: len: %d result: '%s'\n", __func__, len, result.c_str());
>    return result;
>  }
> ---------------------------------------------------------------------
>
> with guile-1.8:
> ---------------------------------------------------------------------
> ly_scm2string: len: 6 result: 'büüh'
> ---------------------------------------------------------------------
>
> with guile-2.0:
> ---------------------------------------------------------------------
> ly_scm2string: len: 4 result: 'bü�'
>
> (process:18706): Pango-WARNING **: Invalid UTF-8 string passed to pango_layout_set_text()
> ---------------------------------------------------------------------
>
> In ly_scm2string() I see that scm_c_string_length() is used.  Looking
> at the documentation
> (https://www.gnu.org/software/guile/manual/html_node/String-Selection.html#String-Selection)
> I read:
>
>       Return the number of characters in string.
>
> So 4 characters looks correct to me, even if they take 6 bytes.
>
> IMHO it would be safer not to mix scm_c_string_length() and
> scm_to_locale_stringbuf().

I've just done a git grep for ly_scm2string, and even if you fix that bug,
most uses of it should _not_ use the current locale.  So ly_scm2string
obviously needs to be split into several different functions.  The
current locale should only be used for writing to the _console_, and
possibly also for writing to the log file.  For everything else, the
encoding in LilyPond is likely UTF-8 (or Latin-1 for efficiency reasons
when LilyPond _knows_ that only the common ASCII subset of UTF-8 and
Latin-1 is being used).
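
As a rough sketch of the "common ASCII subset" condition: a string
whose bytes are all below 0x80 has the same byte representation in
UTF-8 and in Latin-1, so either conversion could be used for it.  The
name is_pure_ascii is hypothetical, not an existing LilyPond function.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: true if every byte is 7-bit ASCII, in which
// case the UTF-8 and Latin-1 encodings of the string are identical.
bool
is_pure_ascii (const std::string &s)
{
  return std::all_of (s.begin (), s.end (),
                      [] (unsigned char c) { return c < 0x80; });
}
```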

-- 
David Kastrup
