On Friday, May 9, 2014 8:12:57 PM UTC-4, Steven D'Aprano wrote:
> Good:
>
>     fStr = re.sub(b'‒', b'-', fStr)

Doesn't work... the document has been verified to contain endash and emdash characters, but this does NOT replace them.

> Better:
>
>     fStr = fStr.replace(b'‒', b'-')

Still doesn't work.

> But having said that, you actually can make use of the nuclear-powered
> bulldozer, and do all the replacements in one go:
>
> Best:
>
>     # Untested
>     fStr = re.sub(b'&#x(201[2-5])|(2E3[AB])|(00[2A]D)', b'-', fStr)

Still doesn't work. I guess whatever the codes for endash and emdash are, they are not the ones I am using....
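
For anyone hitting the same wall, here is a minimal sketch of one way to attack it. It rests on assumptions the thread does not confirm: that the document is UTF-8 encoded and that the dashes are stored as literal characters, not HTML entities. The helper name `normalize_dashes` and the sample bytes are made up for illustration. The idea is to decode the bytes to text first, so the pattern can name code points (U+2012 through U+2015, which includes the en dash U+2013 and the em dash U+2014) instead of guessing at raw byte sequences.

```python
import re

# Code points U+2012..U+2015: figure dash, en dash, em dash, horizontal bar.
DASHES = re.compile("[\u2012-\u2015]")


def normalize_dashes(data: bytes, encoding: str = "utf-8") -> bytes:
    """Replace Unicode dash characters with an ASCII hyphen.

    Hypothetical helper: the encoding is an assumption and must match
    the real document, otherwise decode() raises or yields wrong text.
    """
    text = data.decode(encoding)       # bytes -> str, so we match characters
    cleaned = DASHES.sub("-", text)    # every dash in the range becomes '-'
    return cleaned.encode(encoding)    # back to bytes for the caller


# Made-up sample standing in for the real file contents.
sample = "pages 10\u201320 \u2014 see notes".encode("utf-8")
fStr = normalize_dashes(sample)
print(fStr)  # b'pages 10-20 - see notes'
```

If this still leaves the dashes in place, printing `ascii(fStr.decode(...))` around one of them should show what is actually stored there, for example a different encoding or an entity such as `&ndash;`.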