Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Philip Guenther
On Fri, Dec 1, 2017 at 11:38 AM, Stefan Sperling wrote: > On Fri, Dec 01, 2017 at 12:14:48PM +0100, Ingo Schwarze wrote: > > Anthony J. Bentley wrote on Thu, Nov 30, 2017 at 11:28:54PM -0700: > > > > > You'll need extra fonts once I finish my patch to add situationally > > >

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Stefan Sperling
On Fri, Dec 01, 2017 at 12:14:48PM +0100, Ingo Schwarze wrote: > Hi Anthony, > > Anthony J. Bentley wrote on Thu, Nov 30, 2017 at 11:28:54PM -0700: > > > You'll need extra fonts once I finish my patch to add situationally > > appropriate emoji to all our manpages. > > I'm looking forward to

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Philippe Meunier
Anthony J. Bentley wrote: >I was internally debating this earlier. The bug is already exposed by >any combining characters that don't have precomposed forms. It also >doesn't show up with the default (i.e. non TrueType) fonts. Given that >and how unfriendly the precomposition behavior is, I think

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Philippe Meunier
Anthony J. Bentley wrote: >Philippe Meunier writes: >> - When the precompose resource is set to false, copy-pasting the result of >> printf "e\xcc\x81\n" never works correctly in xterm, regardless of >> whether I use TrueType fonts or not. xterm copy-pastes the correct >> sequence of bytes

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Allan Streib
Allan Streib writes:
> $ printf "e\xcc\x81\n" | od -a
> 0000000 e cc 81 nl
>
> $ printf "e\xcc\x81\n"
> é
>
> ^ copy/pasting: $ echo "é" | od -a
> 0000000 c3 a9 nl
Also in case it's interesting:
$ printf "e\xcc\x81" | xclip -i
$ xclip -o | od -a
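As a rough illustration of the two byte sequences compared in this message, the following sh sketch dumps both spellings of "é". The helper name hexbytes is made up for this example and does not come from the thread. Octal escapes are used instead of \x because not every printf(1) implementation understands \x.

```shell
#!/bin/sh
# hexbytes is a hypothetical helper (not from the thread): it prints
# stdin as lowercase hex digits with all whitespace removed.
hexbytes() {
    od -An -tx1 | tr -d '[:space:]'
}

# Decomposed form: 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed=$(printf 'e\314\201' | hexbytes)

# Precomposed form: U+00E9 LATIN SMALL LETTER E WITH ACUTE.
precomposed=$(printf '\303\251' | hexbytes)

echo "decomposed:  $decomposed"
echo "precomposed: $precomposed"
```

Both strings render as the same glyph in a terminal that handles combining characters, yet the first is three bytes and the second is two, which is the difference od(1) exposes before and after the paste.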

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Anthony J. Bentley
Ingo Schwarze writes: > Hi, > > Anthony J. Bentley wrote on Fri, Dec 01, 2017 at 08:18:59AM -0700: > > Philippe Meunier writes: > > >> - In addition, when the precompose resource is set to false and TrueType > >> fonts are used, the result of printf "e\xcc\x81\n" itself is wrong (even > >>

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Ingo Schwarze
Hi, Anthony J. Bentley wrote on Fri, Dec 01, 2017 at 08:18:59AM -0700: > Philippe Meunier writes: >> - In addition, when the precompose resource is set to false and TrueType >> fonts are used, the result of printf "e\xcc\x81\n" itself is wrong (even >> before trying to copy-paste it): od(1)

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Anthony J. Bentley
Philippe Meunier writes: > - When the precompose resource is set to false, copy-pasting the result of > printf "e\xcc\x81\n" never works correctly in xterm, regardless of > whether I use TrueType fonts or not. xterm copy-pastes the correct > sequence of bytes but that sequence is not

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Allan Streib
Philippe Meunier writes:
> - Allan probably did his tests with the precompose resource set to its
> default true value.
I assume this is correct because I have never deliberately changed it. And you're right after all.
$ printf "e\xcc\x81\n" | od -a
0000000 e cc 81

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Anthony J. Bentley
Ingo Schwarze writes: > >> +*precompose: false > > > Sure. > > On a more serious note, i'll commit that tomorrow then > based on OK bentley@ unless somebody can point out a downside. Please update the OPENBSD SPECIFICS section of the manual as well. > Hum, i don't doubt your analysis. But now i

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Philippe Meunier
Ingo Schwarze wrote: >Hum, i don't doubt your analysis. But now i don't understand why >uxterm(1) works for Allan and plain xterm(1) doesn't... Re-reading Allan's email, it's not clear to me whether he did his tests with the precompose resource set to true or false. If using the default value

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-12-01 Thread Ingo Schwarze
Hi Anthony, Anthony J. Bentley wrote on Thu, Nov 30, 2017 at 11:28:54PM -0700: > You'll need extra fonts once I finish my patch to add situationally > appropriate emoji to all our manpages. I'm looking forward to that. Don't forget to make them animated, make the colours fully configurable,

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-30 Thread Anthony J. Bentley
Hi Ingo, Ingo Schwarze writes: > Except in a professional typesetting system like groff or LaTeX, i > consider anything that makes the end user worry about fonts > fundamentally broken. I think everybody's in agreement that xterm is broken and wrong here. > Fonts that work should be installed

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-30 Thread Ingo Schwarze
Hi, Allan Streib wrote on Thu, Nov 30, 2017 at 12:09:13PM -0500: > Philippe Meunier writes: >> Allan Streib wrote: >>> Are you using xterm(1) or uxterm(1)? >> uxterm does not exist anymore on OpenBSD 6.1: >> https://www.openbsd.org/faq/upgrade61.html > Hm. Well that's one

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-30 Thread Allan Streib
Philippe Meunier writes: > Allan Streib wrote: >>Are you using xterm(1) or uxterm(1)? > > uxterm does not exist anymore on OpenBSD 6.1: > https://www.openbsd.org/faq/upgrade61.html Hm. Well that's one that I overlooked. I've been upgrading since 5.x and I never removed

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-30 Thread Philippe Meunier
Allan Streib wrote:
>Are you using xterm(1) or uxterm(1)?
uxterm does not exist anymore on OpenBSD 6.1:
https://www.openbsd.org/faq/upgrade61.html
Philippe

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-30 Thread Allan Streib
Philippe Meunier writes: > So there seems to be two problems: > > - Copy-pasting the result of printf "e\xcc\x81\n" never works correctly in > xterm, regardless of whether I use TrueType fonts or not. xterm > copy-pastes the correct sequence of bytes but that sequence

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-30 Thread Philippe Meunier
Anthony J. Bentley wrote: >I get the same result, but only when using TrueType fonts (default or no). If I use TrueType fonts: $ printf "e\xcc\x81\n" only shows the letter 'e', and when I try to copy-paste it I get a letter 'e' followed by a question mark inside a circle. If I then redraw the

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-30 Thread Anthony J. Bentley
Philippe Meunier writes: > The strange part is that, when I copy the first filename and paste > it to become the second filename, the second filename is shown without > any accent, even though the first and second filenames are now the exact > same sequence of bytes (I checked using od(1)). So on

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-29 Thread Philippe Meunier
Anthony J. Bentley wrote:
> precompose (class Precompose)
Thanks! That makes xterm work (almost) as expected:
$ ls
Thérèse
$ ls | od -c
0000000    T   h   e 314 201   r   e 314 200   s   e  \n
0000014
$ cp Thérèse Thérèse
cp: Thérèse and Thérèse are identical
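The cp(1) message above only makes sense if the pasted name is byte-for-byte the same as the original. A minimal sh sketch of that check follows. It builds the decomposed name from the od(1) dump in this message and, as an assumption for comparison, a precomposed spelling of the same name. names_match is a hypothetical helper, not anything xterm or cp provides.

```shell
#!/bin/sh
# Decomposed spelling, matching the od -c dump in the message:
# e + U+0301 (314 201) and e + U+0300 (314 200).
nfd=$(printf 'The\314\201re\314\200se')

# Precomposed spelling (assumed for comparison): U+00E9 and U+00E8.
nfc=$(printf 'Th\303\251r\303\250se')

# names_match is a hypothetical helper: true only when both
# arguments are the same string, byte for byte.
names_match() {
    [ "$1" = "$2" ]
}

if names_match "$nfd" "$nfc"; then
    result=identical
else
    result=different
fi

# Byte counts; tr strips the padding some wc(1) implementations add.
nfd_len=$(printf '%s' "$nfd" | wc -c | tr -d '[:space:]')
nfc_len=$(printf '%s' "$nfc" | wc -c | tr -d '[:space:]')

echo "$result: $nfd_len bytes vs $nfc_len bytes"
```

With precomposition enabled, a paste would hand the shell the shorter spelling, so a command like cp would see two different names rather than reporting them as identical.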

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-29 Thread Stefan Sperling
On Wed, Nov 29, 2017 at 07:05:05PM +0100, Ingo Schwarze wrote: > Anthony J. Bentley wrote on Wed, Nov 29, 2017 at 10:29:28AM -0700: > > The only unexpected thing here is xterm doing these transformations > > without asking. > > I think i would support a diff to fix that Seconded. The current

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-29 Thread Ingo Schwarze
Hi Anthony, Anthony J. Bentley wrote on Wed, Nov 29, 2017 at 10:29:28AM -0700: > Ingo Schwarze writes: >> That's a bad idea. Do not use non-ASCII bytes in file names. >> You are in for all kinds of trouble. > I don't agree. In a situation where a single user will be accessing > files, That's

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-29 Thread Anthony J. Bentley
Ingo Schwarze writes: > That's a bad idea. Do not use non-ASCII bytes in file names. > You are in for all kinds of trouble. I don't agree. In a situation where a single user will be accessing files, you can use whatever naming scheme you like. UTF-8 works exactly how you would expect: the

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-29 Thread Ingo Schwarze
Hi Philippe, Philippe Meunier wrote on Wed, Nov 29, 2017 at 11:35:59AM -0500: > Ingo Schwarze wrote: >> Philippe Meunier wrote: >>> $ ls >>> Thérèse >> That's a bad idea. Do not use non-ASCII bytes in file names. > That's a nice thought but in practice I have some files on that machine >

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-29 Thread Philippe Meunier
Ingo Schwarze wrote: >Philippe Meunier wrote: >> $ ls >> Thérèse > >That's a bad idea. Do not use non-ASCII bytes in file names. That's a nice thought but in practice I have some files on that machine with names written in French, Thai, Chinese, Korean, and Japanese, and for some of these

Re: xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-29 Thread Ingo Schwarze
Hi Philippe, Philippe Meunier wrote on Wed, Nov 29, 2017 at 09:11:38AM -0500: > I've noticed something unexpected when copy-pasting UTF-8 characters in > xterm: xterm seems to change some of the characters into something > different but visually similar. Here's an example (using ksh): > > $

xterm(1) changing UTF-8 characters when copy-pasting?

2017-11-29 Thread Philippe Meunier
Hello,
I've noticed something unexpected when copy-pasting UTF-8 characters in xterm: xterm seems to change some of the characters into something different but visually similar. Here's an example (using ksh):
$ uname -a
OpenBSD foo.my.domain 6.1 GENERIC#19 i386
$ ls
Thérèse
$ ls | od -c