Unicode Support in sed?

Scott Vanderbilt Thu, 11 Aug 2016 12:59:26 -0700

I'm trying to use sed to munge some text in HTML files, converting Unicode characters to their HTML entity equivalents, but I can't seem to get it to work.


For instance, this command has no apparent effect:


  sed -i -e 's/\xe2\x80\x94/&mdash;/g' foo.html

Other sed operations using ASCII arguments work fine.
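
For reference, here is a minimal sketch of one variant that might sidestep the problem. It rests on two assumptions that the man page does not confirm: that this sed does not interpret \xHH escapes in the pattern, and that it will match raw bytes when run under LC_ALL=C. The function name emdash_to_entity is made up for illustration. Note also that a bare & in the replacement stands for the matched text, so it is written as \& here to get a literal ampersand.

```shell
# Hypothetical helper: filter stdin, replacing U+2014 (EM DASH) with &mdash;
emdash_to_entity() {
    # Build the three UTF-8 bytes of U+2014 with printf octal escapes,
    # which POSIX printf supports (unlike \x escapes in sed patterns).
    emdash=$(printf '\342\200\224')

    # LC_ALL=C asks sed to treat input and pattern as plain bytes.
    # \& in the replacement is a literal ampersand, not the matched text.
    LC_ALL=C sed "s/${emdash}/\\&mdash;/g"
}

# Usage: feed a line containing the em dash bytes through the filter.
printf 'foo \342\200\224 bar\n' | emdash_to_entity
# prints: foo &mdash; bar
```

The same s command could presumably be used with -i on foo.html once it is known to work as a filter.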

Does sed support Unicode in this fashion? The sed(1) man page is silent. The FAQ section on Character Sets <http://www.openbsd.org/faq/faq10.html#locales> indicates that:


   OpenBSD uses the ASCII character set by default. It also supports
   the Unicode (UTF-8) character set.

but I'm not sure what bearing that has on this issue.

Running OpenBSD 6.0 (GENERIC.MP) #2302: Sat Jul 23 09:33:37 MDT 2016 (amd64)

Many thanks in advance for any assistance.
