Re: Encoding trouble
On Tue, Feb 01 2022 at 12:08 +01, Arash Esbati wrote:
> Denis Bitouzé writes:
>
>> several years ago, I already faced the following problem and,
>> unfortunately, it happened again yesterday, which made me lose quite
>> some time.
>>
>> Let me explain myself: I had a LaTeX file encoded in latin1 that
>> I wanted to encode in UTF-8. I used an external tool, in this case
>> `utrac`,
>
> I think Emacs got upset because it saw you used an external tool --
> hence the punishment

What about local variables? All my files have

%%% Local Variables:
%%% coding: utf-8-unix
%%% mode: latex
%%% TeX-master: t
%%% TeX-PDF-mode: t
%%% TeX-engine: xetex
%%% End:

Regards

- Janusz

--
             ,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien
Re: Encoding trouble
Denis Bitouzé writes:

> several years ago, I already faced the following problem and,
> unfortunately, it happened again yesterday, which made me lose quite
> some time.
>
> Let me explain myself: I had a LaTeX file encoded in latin1 that
> I wanted to encode in UTF-8. I used an external tool, in this case
> `utrac`,

I think Emacs got upset because it saw you used an external tool --
hence the punishment

> which confirmed the starting (latin1) and ending (UTF-8)
> encoding. But, when I opened this file in Emacs with AUCTeX enabled,
> the accented characters were wrong and it was only when I saw that the
> file contained `\usepackage[latin1]{inputenc}` that I understood where
> the problem came from: changing it to `\usepackage[utf8]{inputenc}`
> solved it.

I'm not even sure that AUCTeX has code to deal with a .tex file which
already contains \usepackage[]{inputenc}.  The only thing I'm aware of
is that when you type 'C-c C-m usepackage RET inputenc RET ENC RET',
AUCTeX changes the file encoding according to the chosen ENC.

> So, here is my request: would it be possible that, for the detection of
> the real encoding of the file, AUCTeX relies not on the `inputenc`
> package option, but rather on the Emacs heuristics and that, in case of
> discrepancy between the two, it issues a warning?

Have a look at the variable `file-coding-system-alist'.  If you don't
want automatic conversion based on 'inputenc', remove the entry

  ("\\.\\(tex\\|ltx\\|dtx\\|drv\\)\\'" . latexenc-find-file-coding-system)

Maybe that helps.

Best, Arash
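
A minimal sketch of the removal suggested above, assuming it is placed in the user's init file. It relies only on the entry having the symbol `latexenc-find-file-coding-system' as its cdr, as in the entry quoted in the message; it has not been checked against every Emacs version.

```elisp
;; Sketch, not a tested recipe: drop the entry that lets the `inputenc'
;; option decide the coding system of .tex/.ltx/.dtx/.drv files.
;; `rassq-delete-all' removes every alist element whose cdr is `eq'
;; to the given value, so the regexp does not need to be repeated here.
(setq file-coding-system-alist
      (rassq-delete-all 'latexenc-find-file-coding-system
                        file-coding-system-alist))
```

With the entry gone, files whose `inputenc' option is correct are no longer decoded according to it either, so Emacs' usual detection (or an explicit `coding:' cookie) has to get the encoding right.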
Re: Encoding trouble
Hi Denis,

> Denis Bitouzé writes:
> Let me explain myself: I had a LaTeX file encoded in latin1 that
> I wanted to encode in UTF-8. I used an external tool, in this case
> `utrac`, which confirmed the starting (latin1) and ending (UTF-8)
> encoding. But, when I opened this file in Emacs with AUCTeX enabled, the
> accented characters were wrong and it was only when I saw that the file
> contained `\usepackage[latin1]{inputenc}` that I understood where the
> problem came from: changing it to `\usepackage[utf8]{inputenc}` solved
> it.

> So, here is my request: would it be possible that, for the detection of
> the real encoding of the file, AUCTeX relies not on the `inputenc`
> package option, but rather on the Emacs heuristics and that, in case of
> discrepancy between the two, it issues a warning?

The following two commands might help.

1. C-x RET r
,----
| If you visit a file with a wrong coding system, you can correct this
| with ‘C-x <RET> r’ (‘revert-buffer-with-coding-system’).  This visits
| the current file again, using a coding system you specify.
`----

2. C-x RET c
,----
| Another way to specify the coding system for a file is when you visit
| the file.  First use the command ‘C-x <RET> c’
| (‘universal-coding-system-argument’); this command uses the minibuffer
| to read a coding system name.  After you exit the minibuffer, the
| specified coding system is used for _the immediately following command_.
|
| So if the immediately following command is ‘C-x C-f’, for example, it
| reads the file using that coding system (and records the coding system
| for when you later save the file).  Or if the immediately following
| command is ‘C-x C-w’, it writes the file using that coding system.  When
| you specify the coding system for saving in this way, instead of with
| ‘C-x <RET> f’, there is no warning if the buffer contains characters
| that the coding system cannot handle.
|
| Other file commands affected by a specified coding system include
| ‘C-x i’ and ‘C-x C-v’, as well as the other-window variants of ‘C-x
| C-f’.  ‘C-x <RET> c’ also affects commands that start subprocesses,
| including ‘M-x shell’ (*note Shell::).  If the immediately following
| command does not use the coding system, then ‘C-x <RET> c’ ultimately
| has no effect.
`----
(both quoted from info node "(emacs) Text Coding")

Regards,
Ikumi Keita
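
For use from Lisp rather than from the keyboard, the two commands quoted above can be approximated as follows. This is a sketch under assumptions: the file name `example.tex' is hypothetical, UTF-8 is assumed to be the file's real encoding, and the two forms are alternatives rather than steps to run in sequence.

```elisp
;; Alternative 1: roughly what `C-x RET c utf-8 RET C-x C-f' does.
;; Binding `coding-system-for-read' overrides every other way of
;; choosing the coding system for the read inside the `let'.
;; "example.tex" is a hypothetical file name.
(let ((coding-system-for-read 'utf-8))
  (find-file "example.tex"))

;; Alternative 2: the command behind `C-x RET r'.  Evaluate it in a
;; buffer that is already visiting the wrongly decoded file.
(revert-buffer-with-coding-system 'utf-8)
```

Both forms fix one buffer at a time and do not change how later visits of the file are decoded.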
Re: Encoding trouble
David Kastrup writes:

> Denis Bitouzé writes:
>
>> On 30/01/22 at 15:52, David Kastrup wrote:
>>
>>> That would be pretty annoying for people working with any Latin-x
>>> encoding other than Latin-1 (or in general, any encoding not in Emacs'
>>> default autodetection set).
>>
>> In the case of an encoding Emacs cannot detect, AUCTeX would rely on
>> the `inputenc` option.

That does not even make sense, since all of the Latin-x options look the
same to autodetection.  They cannot be distinguished since they assign
different characters to the same byte values.  Essentially, a Latin-1
user would get every Latin-x except Latin-1 displayed wrongly.  And the
same for Latin-2 users and so on.

>>> Emacs showed you what LaTeX would have shown you.
>>
>> I'm not sure I see your point here.
>
> What is the point of letting Emacs display the input differently from
> how LaTeX would interpret it?

--
David Kastrup
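
The ambiguity described above can be illustrated with a single byte. This is only a sketch: the byte #xF5 is an arbitrary example, and `iso-8859-1' and `iso-8859-2' are the Emacs names for Latin-1 and Latin-2.

```elisp
;; The same byte is a legitimate letter in both Latin-1 and Latin-2,
;; so nothing in the bytes themselves says which encoding was meant.
(decode-coding-string (unibyte-string #xF5) 'iso-8859-1) ; "õ" (U+00F5)
(decode-coding-string (unibyte-string #xF5) 'iso-8859-2) ; "ő" (U+0151)
```

A file containing only such bytes decodes without error under either coding system, which is why a declaration such as the `inputenc` option carries information that a heuristic cannot recover.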
Re: Encoding trouble
Denis Bitouzé writes:

> On 30/01/22 at 15:52, David Kastrup wrote:
>
>> That would be pretty annoying for people working with any Latin-x
>> encoding other than Latin-1 (or in general, any encoding not in Emacs'
>> default autodetection set).
>
> In the case of an encoding Emacs cannot detect, AUCTeX would rely on
> the `inputenc` option.
>
>> Emacs showed you what LaTeX would have shown you.
>
> I'm not sure I see your point here.

What is the point of letting Emacs display the input differently from
how LaTeX would interpret it?

--
David Kastrup
Re: Encoding trouble
On 30/01/22 at 15:52, David Kastrup wrote:

> That would be pretty annoying for people working with any Latin-x
> encoding other than Latin-1 (or in general, any encoding not in Emacs'
> default autodetection set).

In the case of an encoding Emacs cannot detect, AUCTeX would rely on the
`inputenc` option.

> Emacs showed you what LaTeX would have shown you.

I'm not sure I see your point here.

--
Denis
Re: Encoding trouble
Denis Bitouzé writes:

> Hi,
>
> several years ago, I already faced the following problem and,
> unfortunately, it happened again yesterday, which made me lose quite
> some time.
>
> Let me explain myself: I had a LaTeX file encoded in latin1 that
> I wanted to encode in UTF-8. I used an external tool, in this case
> `utrac`, which confirmed the starting (latin1) and ending (UTF-8)
> encoding. But, when I opened this file in Emacs with AUCTeX enabled, the
> accented characters were wrong and it was only when I saw that the file
> contained `\usepackage[latin1]{inputenc}` that I understood where the
> problem came from: changing it to `\usepackage[utf8]{inputenc}` solved
> it.
>
> So, here is my request: would it be possible that, for the detection of
> the real encoding of the file, AUCTeX relies not on the `inputenc`
> package option, but rather on the Emacs heuristics and that, in case of
> discrepancy between the two, it issues a warning?

That would be pretty annoying for people working with any Latin-x
encoding other than Latin-1 (or in general, any encoding not in Emacs'
default autodetection set).

Emacs showed you what LaTeX would have shown you.

--
David Kastrup