Re: xhtml encoding question

2012-02-02 Thread Peter Otten
Ulrich Eckhardt wrote: On 01.02.2012 10:32, Peter Otten wrote: It doesn't matter for the OP (see Stefan Behnel's post), but if you want to replace characters in a unicode string, the best way is probably the translate() method: >>> print u"\xa9\u2122" ©™ >>> u"\xa9\u2122".translate({0xa9: u"&copy;",

Re: xhtml encoding question

2012-02-02 Thread Ulrich Eckhardt
On 02.02.2012 12:02, Peter Otten wrote: Ulrich Eckhardt wrote: >>> u'abc'.translate({u'a': u'A'}) u'abc' I would call this a chance to improve Python. According to the documentation, using a string [as key] is invalid, but it neither raises an exception nor does it do the obvious and accept
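
The behaviour discussed in this message can be reproduced with a short sketch. It is written for Python 3 (an assumption; the thread itself uses Python 2 `u''` literals), and the variable names are illustrative only. It shows that a mapping keyed by a one-character string is silently ignored by `translate()`, while a mapping keyed by the code point, or one built with `str.maketrans()`, is applied.

```python
# str.translate() looks each character up by its ordinal (an int),
# so a mapping keyed by a string is never matched.
table_with_str_key = {"a": "A"}        # string key: silently ignored
table_with_ordinal = {ord("a"): "A"}   # code point key: applied

unchanged = "abc".translate(table_with_str_key)
changed = "abc".translate(table_with_ordinal)

# str.maketrans() accepts single-character string keys and converts
# them to ordinals, which avoids the pitfall.
via_maketrans = "abc".translate(str.maketrans({"a": "A"}))

print(unchanged, changed, via_maketrans)  # abc Abc Abc
```

No exception is raised in the first case, which matches the complaint quoted above.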

Re: xhtml encoding question

2012-02-01 Thread Stefan Behnel
Tim Arnold, 31.01.2012 19:09: I have to follow a specification for producing xhtml files. The original files are in cp1252 encoding and I must reencode them to utf-8. Also, I have to replace certain characters with html entities. I think I've got this right, but I'd like to hear if there's

Re: xhtml encoding question

2012-02-01 Thread Ulrich Eckhardt
On 31.01.2012 19:09, Tim Arnold wrote: high_chars = { 0x2014:'&mdash;', # 'EM DASH', 0x2013:'&ndash;', # 'EN DASH', 0x0160:'&Scaron;',# 'LATIN CAPITAL LETTER S WITH CARON', 0x201d:'&rdquo;', # 'RIGHT DOUBLE QUOTATION MARK', 0x201c:'&ldquo;', # 'LEFT DOUBLE QUOTATION MARK',

Re: xhtml encoding question

2012-02-01 Thread Peter Otten
Ulrich Eckhardt wrote: On 31.01.2012 19:09, Tim Arnold wrote: high_chars = { 0x2014:'&mdash;', # 'EM DASH', 0x2013:'&ndash;', # 'EN DASH', 0x0160:'&Scaron;',# 'LATIN CAPITAL LETTER S WITH CARON', 0x201d:'&rdquo;', # 'RIGHT DOUBLE QUOTATION MARK', 0x201c:'&ldquo;', # 'LEFT

Re: xhtml encoding question

2012-02-01 Thread Ulrich Eckhardt
On 01.02.2012 10:32, Peter Otten wrote: It doesn't matter for the OP (see Stefan Behnel's post), but if you want to replace characters in a unicode string, the best way is probably the translate() method: >>> print u"\xa9\u2122" ©™ >>> u"\xa9\u2122".translate({0xa9: u"&copy;", 0x2122: u"&trade;"})
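
As a rough illustration of the translate() suggestion quoted in this message, here is a minimal Python 3 sketch (the thread uses Python 2; the port is an assumption). The names `ENTITIES` and `replace_with_entities` are hypothetical and do not come from the thread.

```python
# Mapping from code point (int) to replacement string.
ENTITIES = {
    0x00A9: "&copy;",   # COPYRIGHT SIGN
    0x2122: "&trade;",  # TRADE MARK SIGN
}


def replace_with_entities(text, table=ENTITIES):
    """Return text with every character listed in table replaced."""
    # translate() makes a single pass over the string and leaves
    # characters that are not in the table untouched.
    return text.translate(table)


result = replace_with_entities("\xa9\u2122 2012")
print(result)  # &copy;&trade; 2012
```

One pass with a table tends to be simpler than chaining several `replace()` calls, since each character is looked up once regardless of how many entries the table has.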

Re: xhtml encoding question

2012-02-01 Thread Tim Arnold
On 2/1/2012 3:26 AM, Stefan Behnel wrote: Tim Arnold, 31.01.2012 19:09: I have to follow a specification for producing xhtml files. The original files are in cp1252 encoding and I must reencode them to utf-8. Also, I have to replace certain characters with html entities.

Re: xhtml encoding question

2012-02-01 Thread Stefan Behnel
Tim Arnold, 01.02.2012 19:15: On 2/1/2012 3:26 AM, Stefan Behnel wrote: Tim Arnold, 31.01.2012 19:09: I have to follow a specification for producing xhtml files. The original files are in cp1252 encoding and I must reencode them to utf-8. Also, I have to replace certain characters with html

xhtml encoding question

2012-01-31 Thread Tim Arnold
I have to follow a specification for producing xhtml files. The original files are in cp1252 encoding and I must reencode them to utf-8. Also, I have to replace certain characters with html entities. I think I've got this right, but I'd like to hear if there's something I'm doing that is
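
The task described in this message (decode cp1252, replace selected characters with entities, write UTF-8) might be sketched as follows. This is not the poster's code: it is a Python 3 sketch, the function name `reencode` is hypothetical, and the table holds only a few entries modelled on the `high_chars` table quoted elsewhere in the thread.

```python
# Partial table for illustration only; keys are code points.
HIGH_CHARS = {
    0x2014: "&mdash;",  # EM DASH
    0x2013: "&ndash;",  # EN DASH
    0x201C: "&ldquo;",  # LEFT DOUBLE QUOTATION MARK
    0x201D: "&rdquo;",  # RIGHT DOUBLE QUOTATION MARK
}


def reencode(raw_cp1252: bytes) -> bytes:
    """Decode cp1252 bytes, substitute entities, return UTF-8 bytes."""
    text = raw_cp1252.decode("cp1252")   # bytes -> str
    text = text.translate(HIGH_CHARS)    # listed characters -> entities
    return text.encode("utf-8")          # str -> UTF-8 bytes


# In cp1252, 0x93/0x94 are curly quotes, 0x97 is an em dash, 0xe9 is e-acute.
sample = b"\x93caf\xe9\x94 \x97 ok"
converted = reencode(sample)
print(converted)  # b'&ldquo;caf\xc3\xa9&rdquo; &mdash; ok'
```

Characters that are not in the table, such as the accented letter in the sample, pass through unchanged and are simply stored as UTF-8.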