Ouch.  Correcting myself.

Ingo Schwarze wrote on Sun, Feb 16, 2014 at 03:11:07PM +0100:

> 1. I asked around a bit and Thomas Klausner (NetBSD) mentioned
> that both groff and mandoc format bare, unescaped ASCII minus
> characters (`-', 0x2d) found in the input stream as the
> three-byte UTF-8 sequence 0xe2 0x80 0x93 in the output stream
> when running with -Tutf8 or with -Tlocale and LC_CTYPE=*_*.UTF-8.

Dmitrij D. Czarkoff just pointed out to me in private mail that
this isn't true at all.  I misunderstood what Thomas said.
So i re-checked.  Here is how the various dashes and hyphens
actually render in both groff and mandoc:

  input   output  output
  -----   ASCII   UTF-8
          -----   -----
  -       -       -
  \-      -       -
  \(hy    -       U+2010
  \(en    -       U+2013
  \(em    --      U+2014

> That can be annoying when trying to copy and paste code examples
> from formatted manual pages.

Consequently, that can only happen if people use \(hy, \(en, or \(em
for formatting their code examples.  Hopefully, few people do that.

Yours,
  Ingo
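
For illustration only, the table above can be written down as a small
Python lookup.  This is a sketch, not part of groff or mandoc: the names
RENDERING, breaks_copy_paste and find_unicode_dashes are made up here,
and the data is just the five rows from the table.

```python
# Hypothetical lookup mirroring the table in this mail:
# roff input -> (ASCII output, UTF-8 output).
RENDERING = {
    "-": ("-", "-"),
    "\\-": ("-", "-"),
    "\\(hy": ("-", "\u2010"),
    "\\(en": ("-", "\u2013"),
    "\\(em": ("--", "\u2014"),
}


def breaks_copy_paste(roff_input):
    """Return True if the UTF-8 rendering contains a non-ASCII character."""
    _ascii_out, utf8_out = RENDERING[roff_input]
    return any(ord(ch) > 127 for ch in utf8_out)


def find_unicode_dashes(text):
    """Return (index, code point) pairs for U+2010, U+2013, U+2014 in text."""
    dashes = {"\u2010", "\u2013", "\u2014"}
    return [(i, "U+%04X" % ord(ch)) for i, ch in enumerate(text) if ch in dashes]
```

Under these assumptions, breaks_copy_paste() is true only for \(hy, \(en
and \(em, and find_unicode_dashes() can be run over text pasted from a
formatted page to spot any of the three characters.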