* Daiki Ueno <[email protected]> [2013-06-25 05:58]:
> Hi Guido,
>
> Guido Berhoerster <[email protected]> writes:
>
> > xgettext parsing of Tcl unicode code point escapes is broken, it tries
> > to replace the escape with the literal unicode character but does not
> > consume the last character of the escape but copies it into the output
> > which results in corrupt .po files, e.g.:
> >
> > $ cat gettext-bug.tcl
> > puts [msgcat::mc "Hello\u200e\u201cWorld\u201d"]
> >
> > $ /usr/bin/xgettext -o- gettext-bug.tcl
> > #: gettext-bug.tcl:5
> > msgid "Helloe“cWorld”d"
> > msgstr ""
>
> Thanks for the report.
>
> > It should probably not try to substitute these escapes at all as it
> > results in fragile .po files with embedded control characters, see
> > e.g. the U+200E left-to-right mark in the above example.
>
> I've just pushed the attached patch (\x fix in the patch is not really
> necessary, sorry; partially reverted in the git).

Thanks for the quick fix; that substitution works correctly now.

I still wonder why you're substituting \u escapes with Unicode
characters at all, since that potentially allows unescaped control
sequences, which make the .po file quite fragile.

-- 
Guido Berhoerster
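
For illustration, the off-by-one described in the report can be modelled
with a short Python sketch. This is not xgettext's actual C code: the
function name decode_tcl_u_escapes and the buggy flag are hypothetical,
and the "resume one character too early" behaviour is only an assumption
that happens to reproduce the msgid shown in the report. It relies on Tcl
accepting one to four hex digits after \u.

```python
import string

HEX = set(string.hexdigits)

def decode_tcl_u_escapes(text, buggy=False):
    """Replace Tcl \\uXXXX escapes (1 to 4 hex digits) with the character.

    Hypothetical model for illustration only, not xgettext's code.
    """
    out = []
    i = 0
    while i < len(text):
        if text.startswith("\\u", i):
            j = i + 2
            # Collect up to four hex digits following "\u".
            while j < len(text) and j - i - 2 < 4 and text[j] in HEX:
                j += 1
            digits = text[i + 2:j]
            if digits:
                out.append(chr(int(digits, 16)))
                # Assumed bug: resume one character too early, so the last
                # hex digit is copied into the output as well.
                i = j - 1 if buggy else j
                continue
        # Ordinary character (or "\u" without hex digits): copy as-is.
        out.append(text[i])
        i += 1
    return "".join(out)

# The string from gettext-bug.tcl, with the escapes left undecoded.
source = r"Hello\u200e\u201cWorld\u201d"
broken = decode_tcl_u_escapes(source, buggy=True)
fixed = decode_tcl_u_escapes(source)
```

With buggy=True the trailing "e", "c" and "d" of each escape leak into
the result, as in the reported msgid; without it each escape is consumed
whole. Either way the decoded string contains a raw U+200E, which is the
fragility concern raised above.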
