Arnaud Delobelle wrote:

> On 3 December 2011 23:51, Peter Otten <__pete...@web.de> wrote:
>> Arnaud Delobelle wrote:
>>
>>> I need to generate some java .properties files in Python (2.6 / 2.7).
>>> It's a simple format to store key/value pairs e.g.
>>>
>>> blue=bleu
>>> green=vert
>>> red=rouge
>>>
>>> The keys and values are unicode strings.  The annoying thing is that the
>>> file is encoded in ISO 8859-1, with all non-Latin-1 characters escaped
>>> in the form \uHHHH (the same way unicode characters are escaped in
>>> Python).
>>>
>>> I thought I could use the "unicode_escape" codec.  But it doesn't work
>>> because it escapes Latin1 characters with escape sequences of the form
>>> \xHH, which is not valid in a java .properties file.
>>>
>>> Is there a simple way to achieve this? I could do something like this:
>>>
>>> def encode(u):
>>>     """encode a unicode string in .properties format"""
>>>     return u"".join(u"\\u%04x" % ord(c) if ord(c) > 0xFF else c
>>>                     for c in u).encode("latin_1")
>>>
>>> but it would be quite inefficient as I have many to generate.
>>
>> >>> class D(dict):
>> ...     def __missing__(self, key):
>> ...         result = self[key] = u"\\u%04x" % key
>> ...         return result
>> ...
>> >>> d = D(enumerate(map(unichr, range(256))))
>> >>> u"ähnlich üblich nötig ΦΧΨ"
>> u'\xe4hnlich \xfcblich n\xf6tig \u03a6\u03a7\u03a8'
>> >>> u"ähnlich üblich nötig ΦΧΨ".translate(d)
>> u'\xe4hnlich \xfcblich n\xf6tig \\u03a6\\u03a7\\u03a8'
>> >>> u"ähnlich üblich nötig ΦΧΨ".translate(d).encode("latin1")
>> '\xe4hnlich \xfcblich n\xf6tig \\u03a6\\u03a7\\u03a8'
> 
> A very nice solution - thanks, Peter.

I found another one:

>>> u"äöü ΦΧΨ".encode("latin1", "backslashreplace")
'\xe4\xf6\xfc \\u03a6\\u03a7\\u03a8'
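
For writing whole files, the same call can be wrapped in a small helper. This is only a rough sketch: it uses Python 3 syntax rather than the 2.6 / 2.7 from the question, `to_properties` is a made-up name, and it does not escape characters such as "=" or ":" inside keys. Characters above U+FFFF would come out as \UHHHHHHHH, which Java's format does not read, so they would need separate handling.

```python
# Sketch only: Python 3 syntax; `to_properties` is a hypothetical helper name.
def to_properties(pairs):
    """Render (key, value) pairs as ISO 8859-1 bytes in .properties style."""
    lines = ["{}={}\n".format(key, value) for key, value in pairs]
    # Latin-1 characters are encoded directly; anything else becomes \uHHHH.
    return "".join(lines).encode("latin-1", "backslashreplace")


data = to_properties([("blue", "bleu"), ("greek", "äöü ΦΧΨ")])
print(data)
```

Encoding the joined text once, rather than string by string, keeps the work inside the codec, which should matter when there are many entries to generate.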


