> Date: Sun, 23 Sep 2018 09:11:41 -0400
> From: John J. Xenakis <hew...@jxenakis.com>
> Cc: jxenakis...@gmail.com
>
> You suggested that I use the "raw-text" coding system, implying that
> these characters are random binary data. But they're actually
> completely valid 8-bit characters that are commonly used in Western
> media.

Then there's still something not right, because you shouldn't be
having any of these problems with files that are consistently encoded.

> So the net result is that emacs loads a Windows text file on a Windows
> system, decides that it's really a Unix file (which it isn't), and
> then really damages the file in a way that's almost impossible to
> recover from. Eli, this is not something that an editor should be
> doing gratuituously.

It shouldn't and it doesn't. Depending on what exactly is in your
files, something that is still a bit of a mystery for me, Emacs could
sometimes err if you don't tell it enough. But in any case, there are
commands to fix those errors right away, as soon as you realize
something like that happens. We will get to that, once I understand
more about the problem.

> So the ad-hoc workaround is this:
>
> * Open the file in Notepad. All the 8-bit characters are displayed
>   correctly.
> * Select and copy the entire text in Notepad.
> * In emacs, open a new text file.
> * Paste the text that you copied from Notepad.
> * Save the result.
>
> Much to my relief, this cures all the 8-bit problems, and I can go
> back to reloading and editing the file in emacs.

Is it possible that the file is encoded in UTF-16 or UTF-8? What
happens if you visit the file like this:

  C-x RET c utf-8 RET C-x C-f FILENAME RET

and similarly for utf-16? Does this fix the problem?

And how were those files created in the first place? I understood
from your previous explanations that you created those files by
copy-pasting from other applications, is that right?

> So I select the character é (e with an acute accent, as in the first
> letter of the French spelling of the word elite). Here is the
> information that "C-x=" provides in each of the two cases, the damaged
> and repaired file respectively:
>
>   Char: \351 (4194281, #o17777751, #x3fffe9, raw-byte) point=76501 of
>   343691 (22%) column=51
>
>   Char: é (233, #o351, #xe9, file #xE9) point=74734 of 336596 (22%)
>   column=51

Can you post one such file, please? It is important that you post a
file as a binary attachment, and it is also important to verify that
the trick with Notepad and copy/paste works with the file you post.

I'm quite sure this is caused by something very simple, because
Notepad is certainly not smarter than Emacs wrt encodings.
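
For anyone who wants to check a file's bytes outside of Emacs before
posting it, here is a rough Python sketch of the questions asked above:
is there a UTF-8 or UTF-16 byte-order mark, does the content decode as
UTF-8, and are the line endings consistent. The helper name
inspect_bytes and the sample bytes are made up for illustration; this
is not what Emacs does internally, only a quick way to look at the raw
data. The last lines show that the raw-byte code 4194281 in the "C-x="
output quoted above is the byte 0xE9 offset by 0x3FFF00, and that the
same byte decodes to é under Latin-1 and Windows-1252.

```python
import codecs


def inspect_bytes(data: bytes) -> dict:
    """Return a few encoding hints for raw file contents (hypothetical helper)."""
    # Look for a byte-order mark at the start of the data.
    # Note: a UTF-32 LE BOM also begins with the UTF-16 LE BOM bytes;
    # this sketch does not try to tell those two apart.
    if data.startswith(codecs.BOM_UTF8):
        bom = "utf-8"
    elif data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        bom = "utf-16"
    else:
        bom = None

    # A lone 8-bit byte such as 0xE9 is not valid UTF-8 on its own.
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False

    # Count DOS (CRLF) and Unix (bare LF) line endings separately;
    # a mix of the two can confuse end-of-line detection.
    crlf = data.count(b"\r\n")
    bare_lf = data.count(b"\n") - crlf

    return {
        "bom": bom,
        "valid_utf8": valid_utf8,
        "has_8bit": any(b >= 0x80 for b in data),
        "crlf": crlf,
        "bare_lf": bare_lf,
    }


# Made-up sample: one 8-bit byte, one CRLF line ending, one bare LF.
sample = b"\xe9lite\r\nplain line\n"
report = inspect_bytes(sample)
print(report)

# The byte 0xE9 as a single-byte character, and as the raw-byte code
# reported in the quoted "C-x=" output (#x3fffe9).
as_latin1 = b"\xe9".decode("latin-1")
as_cp1252 = b"\xe9".decode("cp1252")
raw_byte_code = 0x3FFF00 + 0xE9
print(as_latin1, as_cp1252, raw_byte_code)
```

To run it on a real file, read the file in binary mode, for example
inspect_bytes(open(FILENAME, "rb").read()), so that Python does no
decoding of its own before the check.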