Stefan Monnier wrote:
Well, maybe we can help, if you tell us what you know, ;-)
The mail you replied to was all I knew at the time. But here is a distilled description of the problem (I've omitted the 1025 character string):
ELISP> (setq str (string-to-multibyte <1025 ASCII character string>))
...
ELISP> (multibyte-string-p str)
t
ELISP> (multibyte-string-p (encode-coding-string str
'compound-text-with-extensions))
t <---- BUG, should be nil
ELISP> (multibyte-string-p (encode-coding-string str 'utf-8))
nil
Most applications don't ask for 'compound-text, so most of the time the xassert doesn't abort.
The compound-text case exits at the second return in encode_coding_string in coding.c, inside the if (from == to_byte) branch near the bottom of this code fragment:
  if (! CODING_REQUIRE_ENCODING (coding))
    {
      coding->consumed = SBYTES (str);
      coding->consumed_char = SCHARS (str);
      if (STRING_MULTIBYTE (str))
        {
          str = Fstring_as_unibyte (str);
          nocopy = 1;
        }
      coding->produced = SBYTES (str);
      coding->produced_char = SCHARS (str);
      return (nocopy ? str : Fcopy_sequence (str));
    }

  if (coding->composing != COMPOSITION_DISABLED)
    coding_save_composition (coding, from, to, str);

  /* Try to skip the heading and tailing ASCIIs.  We can't skip them
     if we must run CCL program or there are compositions to
     encode.  */
  if (coding->type != coding_type_ccl
      && (! coding->cmp_data || coding->cmp_data->used == 0))
    {
      SHRINK_CONVERSION_REGION (&from, &to_byte, coding, SDATA (str),
                                1);
      if (from == to_byte)
        {
          coding_free_composition_data (coding);
          return (nocopy ? str : Fcopy_sequence (str));
        }
      shrinked_bytes = from + (SBYTES (str) - to_byte);
    }

So if str is multibyte when it enters this function, the return value is multibyte too. I suspect Fstring_as_unibyte should be called here, as it is in the previous early return.
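
To make the suspicion concrete, here is a minimal stand-alone sketch of the two behaviours. It is a toy model, not a patch against coding.c: struct toy_string and every toy_* function are hypothetical stand-ins invented for illustration, and the all-ASCII check only approximates the situation where SHRINK_CONVERSION_REGION leaves from == to_byte.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a Lisp string: bytes plus a multibyte flag.  */
struct toy_string
{
  const char *data;
  size_t nbytes;
  bool multibyte;
};

/* Stand-in for Fstring_as_unibyte: same bytes, multibyte flag cleared.  */
static struct toy_string
toy_as_unibyte (struct toy_string s)
{
  s.multibyte = false;
  return s;
}

/* Approximates the from == to_byte condition: after skipping leading
   and trailing ASCII bytes, nothing is left to encode.  */
static bool
toy_all_ascii (struct toy_string s)
{
  for (size_t i = 0; i < s.nbytes; i++)
    if ((unsigned char) s.data[i] >= 0x80)
      return false;
  return true;
}

/* Models the second early return as quoted: the input is handed back
   unchanged, so a multibyte flag leaks into the result.  */
static struct toy_string
toy_encode_current (struct toy_string s)
{
  if (toy_all_ascii (s))
    return s;
  return toy_as_unibyte (s);	/* real encoding would happen here */
}

/* Models the suggested change: convert to unibyte before returning
   early, mirroring what the first early return already does.  */
static struct toy_string
toy_encode_proposed (struct toy_string s)
{
  if (toy_all_ascii (s))
    return s.multibyte ? toy_as_unibyte (s) : s;
  return toy_as_unibyte (s);	/* real encoding would happen here */
}
```

With an all-ASCII multibyte input, toy_encode_current hands back a string still flagged multibyte, while toy_encode_proposed clears the flag. In the real function the fix would presumably also need to set nocopy, as the first early return does, but I have not tested that.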
Jan D.
