New submission from Pekka Pessi <[EMAIL PROTECTED]>:

When the repo contents are shown with darcs list, file names that
contain 8-bit chars (UTF-8, ISO-8859-*, or whatever) are converted to
UTF-8 as if they were ISO-8859-1.

For example, a file named "Ääliö älä lyö ööliä läikkyy" in 8859-1 is the
byte string \e4\e4\6c\69\f6 ...
It is shown with, e.g., darcs changes --summary as the quoted bytestring
[_\e4_][_\e4_]li[_\f6_] ...

With darcs list files it is shown as
./[_\c3_][_\a4_][_\c3_][_\a4_]li[_\c3_][_\b6_] (in other words, it has
been converted to UTF-8 as if it were ISO-8859-1).

If the file name is encoded in UTF-8, it has the bytestring
\c3\84\c3\a4\6c\69\c3\b6 (each accented char is now encoded in two
bytes). It is shown with, e.g., darcs changes --summary as the quoted
bytestring [_\c3_][_\84_][_\c3_][_\a4_]li[_\c3_][_\b6_]

However, with darcs list files it is shown as

[_\c3_][_\83_][_\c2_][_\84_][_\c3_][_\83_][_\c2_][_\a4_]li[_\c3_][_\83_][_\c2_][_\b6_]

that is, darcs list assumes that the bytestring is an ISO-8859-1 string
and converts it into UTF-8, so a name that was already UTF-8 ends up
encoded twice.
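
The byte sequences above can be reproduced outside darcs. The following
is only a sketch in Python, not darcs code: it assumes the conversion
amounts to reading each byte as an ISO-8859-1 code point and re-encoding
it as UTF-8. The names latin1_to_utf8 and quote are made up for this
illustration, and quote only imitates the [_\xx_] quoting seen above.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Read each byte as an ISO-8859-1 code point, then re-encode as UTF-8.

    Assumption: this is the conversion the observed output suggests,
    not a copy of what darcs actually does internally.
    """
    return raw.decode("latin-1").encode("utf-8")


def quote(raw: bytes) -> str:
    """Hypothetical helper: show non-ASCII bytes in the [_\\xx_] style."""
    return "".join(chr(b) if b < 0x80 else f"[_\\{b:02x}_]" for b in raw)


# Start of the file name as stored on disk in each encoding (from the report).
latin1_name = b"\xe4\xe4li\xf6"
utf8_name = b"\xc3\x84\xc3\xa4li\xc3\xb6"

# ISO-8859-1 input: each accented char becomes a two-byte UTF-8 sequence.
print(quote(latin1_to_utf8(latin1_name)))

# UTF-8 input: every non-ASCII byte is expanded again (double encoding).
print(quote(latin1_to_utf8(utf8_name)))
```

If the assumption holds, the first line printed matches the darcs list
files output for the 8859-1 name (without the leading "./"), and the
second matches the doubled output for the UTF-8 name.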

Script output from a UTF-8 terminal is attached.

----------
files: list-files-utf8
messages: 2385
nosy: beschmi, droundy, kowey, ppessi, tommy
status: unread
title: list files converts names to utf-8 even if they already are utf-8

__________________________________
Darcs bug tracker <[EMAIL PROTECTED]>
<http://bugs.darcs.net/issue580>
__________________________________

Attachment: list-files-utf8
Description: Binary data

_______________________________________________
darcs-devel mailing list
darcs-devel@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-devel