xgettext's parsing of Tcl Unicode code point escapes (\uXXXX) is broken. It tries to replace each escape with the literal Unicode character, but it does not consume the last character of the escape and instead copies that character into the output as well. This results in corrupt .po files, e.g.:
----8<----
$ cat gettext-bug.tcl
#!/usr/bin/tclsh

package require msgcat

puts [msgcat::mc "Hello\u200e\u201cWorld\u201d"]
$ /usr/bin/xgettext -o- gettext-bug.tcl
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2013-06-24 16:24+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <[email protected]>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: gettext-bug.tcl:5
msgid "Hello‎e“cWorld”d"
msgstr ""
---->8----

xgettext should probably not try to substitute these escapes at all, since doing so results in fragile .po files with embedded control characters; see, for example, the U+200E left-to-right mark in the msgid above.

-- 
Guido Berhoerster
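
To make the suspected off-by-one easier to see, here is a minimal Python sketch of a \uXXXX decoder. It is not xgettext's actual code; the function name decode_tcl_unicode and the consume_last_digit switch are hypothetical and exist only for illustration. It assumes the Tcl 8.x rule that \u is followed by one to four hex digits. With consume_last_digit=False the scanner resumes at the last hex digit instead of just past it, which reproduces the stray "e", "c" and "d" seen in the msgid above.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_tcl_unicode(text, consume_last_digit=True):
    """Replace Tcl-style \\uXXXX escapes (1 to 4 hex digits) with their characters.

    consume_last_digit=False is a hypothetical switch that mimics the
    reported behaviour: the last hex digit is left in the input and is
    therefore copied into the output after the decoded character.
    """
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] == "u":
            start = i + 2
            j = start
            # Collect up to four hex digits after "\u".
            while j < len(text) and j - start < 4 and text[j] in HEX_DIGITS:
                j += 1
            if j > start:
                out.append(chr(int(text[start:j], 16)))
                # Correct: continue after the escape.
                # Buggy:   continue at its last digit, which gets copied.
                i = j if consume_last_digit else j - 1
                continue
        # Ordinary character (or "\u" without digits, left as-is in this sketch).
        out.append(text[i])
        i += 1
    return "".join(out)


source = r"Hello\u200e\u201cWorld\u201d"

# ascii() keeps the invisible U+200E visible in the printed result.
print(ascii(decode_tcl_unicode(source)))
# 'Hello\u200e\u201cWorld\u201d'
print(ascii(decode_tcl_unicode(source, consume_last_digit=False)))
# 'Hello\u200ee\u201ccWorld\u201dd'
```

The second result matches the corrupt msgid in the transcript, which is consistent with the escape's last character not being consumed, though the real cause in xgettext may differ in detail.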
