I'm trying to use sed to munge some text in HTML files, converting
Unicode characters to their HTML entity equivalents, but I can't
seem to get it to work.
For instance, this command has no apparent effect:
    sed -i -e 's/\xe2\x80\x94/&mdash;/g' foo.html
Other sed operations using ASCII arguments work fine.
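The bytes in the pattern are meant to be the UTF-8 encoding of U+2014
(em dash). A minimal sketch to confirm that encoding, assuming a POSIX
printf, od, and tr (the function name em_dash_hex is only illustrative,
and octal escapes are used because POSIX printf does not guarantee \x):

```shell
# Emit the em dash (U+2014) as raw UTF-8 bytes using POSIX octal escapes,
# then dump those bytes as hex with all whitespace stripped.
em_dash_hex() {
    printf '\342\200\224' | od -An -tx1 | tr -d ' \n'
}

em_dash_hex   # prints: e28094
```

This only confirms that the byte sequence is right; it says nothing about
whether sed itself interprets \x escapes in a pattern.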
Does sed support Unicode in this fashion? The sed(1) man page is silent.
The FAQ section on Character Sets
<http://www.openbsd.org/faq/faq10.html#locales> indicates that:
    OpenBSD uses the ASCII character set by default. It also supports
    the Unicode (UTF-8) character set.
but I'm not sure what bearing that has on this issue.
I'm running OpenBSD 6.0 (GENERIC.MP) #2302: Sat Jul 23 09:33:37 MDT 2016 (amd64).
Many thanks in advance for any assistance.