Re: Encoding trouble

2022-02-01 Thread Janusz S. Bień
On Tue, Feb 01 2022 at 12:08 +01, Arash Esbati wrote:
> Denis Bitouzé  writes:
>
>> several years ago, I already faced the following problem and,
>> unfortunately, it happened again yesterday, which made me lose quite
>> some time.
>>
>> Let me explain myself: I had a LaTeX file encoded in latin1 that
>> I wanted to encode in UTF-8. I used an external tool, in this case
>> `utrac`,
>
> I think Emacs got upset because it saw you used an external tool --
> hence the punishment 

What about local variables?

All my files have

%%% Local Variables: 
%%% coding: utf-8-unix
%%% mode: latex
%%% TeX-master: t
%%% TeX-PDF-mode: t
%%% TeX-engine: xetex
%%% End: 

Regards - Janusz

-- 
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien



Re: Encoding trouble

2022-02-01 Thread Arash Esbati
Denis Bitouzé  writes:

> several years ago, I already faced the following problem and,
> unfortunately, it happened again yesterday, which made me lose quite
> some time.
>
> Let me explain myself: I had a LaTeX file encoded in latin1 that
> I wanted to encode in UTF-8. I used an external tool, in this case
> `utrac`,

I think Emacs got upset because it saw you used an external tool --
hence the punishment 

> which confirmed the starting (latin1) and ending (UTF-8)
> encoding. But, when I opened this file in Emacs with AUCTeX enabled,
> the accented characters were wrong and it was only when I saw that the
> file contained `\usepackage[latin1]{inputenc}` that I understood where
> the problem came from: changing it to `\usepackage[utf8]{inputenc}`
> solved it.

I'm not even sure that AUCTeX has code to deal with a .tex file which
already contains \usepackage[]{inputenc}.  The only thing I'm aware
of is that when you type 'C-c C-m usepackage RET inputenc RET ENC RET',
AUCTeX changes the file encoding according to the chosen ENC.

> So, here is my request: would it be possible that, for the detection of
> the real encoding of the file, AUCTeX relies not on the `inputenc`
> package option, but rather on the Emacs heuristics and that, in case of
> discrepancy between the two, it issues a warning?

Have a look at the variable `file-coding-system-alist'.  If you don't
want automatic conversion based on 'inputenc', remove the entry

 ("\\.\\(tex\\|ltx\\|dtx\\|drv\\)\\'" . latexenc-find-file-coding-system)

Maybe that helps.
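
A minimal sketch of that removal for an init file, assuming the entry
is present exactly as quoted above and that falling back to Emacs's
ordinary detection for these files is acceptable:

```elisp
;; Stop Emacs from deriving the coding system of LaTeX files from
;; their inputenc option.  `rassq-delete-all' removes every alist
;; entry whose value is the given symbol.
(setq file-coding-system-alist
      (rassq-delete-all 'latexenc-find-file-coding-system
                        file-coding-system-alist))
```

This only affects files visited after the form has been evaluated;
buffers that are already open keep the coding system they were read
with.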

Best, Arash




Re: Encoding trouble

2022-02-01 Thread Ikumi Keita
Hi Denis,

> Denis Bitouzé  writes:
> Let me explain myself: I had a LaTeX file encoded in latin1 that
> I wanted to encode in UTF-8. I used an external tool, in this case
> `utrac`, which confirmed the starting (latin1) and ending (UTF-8)
> encoding. But, when I opened this file in Emacs with AUCTeX enabled, the
> accented characters were wrong and it was only when I saw that the file
> contained `\usepackage[latin1]{inputenc}` that I understood where the
> problem came from: changing it to `\usepackage[utf8]{inputenc}` solved
> it.

> So, here is my request: would it be possible that, for the detection of
> the real encoding of the file, AUCTeX relies not on the `inputenc`
> package option, but rather on the Emacs heuristics and that, in case of
> discrepancy between the two, it issues a warning?

Maybe the following two commands might help.
1. C-x RET r
,----
| If you visit a file with a wrong coding system, you can correct this
| with ‘C-x <RET> r’ (‘revert-buffer-with-coding-system’).  This visits
| the current file again, using a coding system you specify.
`----

2. C-x RET c
,----
| Another way to specify the coding system for a file is when you visit
| the file.  First use the command ‘C-x <RET> c’
| (‘universal-coding-system-argument’); this command uses the minibuffer
| to read a coding system name.  After you exit the minibuffer, the
| specified coding system is used for _the immediately following command_.
| 
| So if the immediately following command is ‘C-x C-f’, for example, it
| reads the file using that coding system (and records the coding system
| for when you later save the file).  Or if the immediately following
| command is ‘C-x C-w’, it writes the file using that coding system.  When
| you specify the coding system for saving in this way, instead of with
| ‘C-x <RET> f’, there is no warning if the buffer contains characters
| that the coding system cannot handle.
| 
| Other file commands affected by a specified coding system include
| ‘C-x i’ and ‘C-x C-v’, as well as the other-window variants of ‘C-x
| C-f’.  ‘C-x <RET> c’ also affects commands that start subprocesses,
| including ‘M-x shell’ (*note Shell::).  If the immediately following
| command does not use the coding system, then ‘C-x <RET> c’ ultimately
| has no effect.
`----
(both quoted from info node "(emacs) Text Coding")
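
For use from Lisp, for instance with `M-:', the two commands can be
approximated as below.  This is only a sketch: `example.tex' is a
hypothetical file name, and utf-8 stands for whatever coding system
the file really uses.

```elisp
;; Counterpart of `C-x RET r utf-8 RET': re-read the file visited by
;; the current buffer, decoding it as UTF-8.
(revert-buffer-with-coding-system 'utf-8)

;; Counterpart of `C-x RET c utf-8 RET C-x C-f': while
;; `coding-system-for-read' is bound, it overrides the normal choice
;; of coding system for reads done inside the `let'.
(let ((coding-system-for-read 'utf-8))
  (find-file "example.tex"))   ; hypothetical file name
```

If the file is already visited in a buffer, `find-file' just switches
to that buffer without reading the file again, so the first form is
the one to use in that case.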

Regards,
Ikumi Keita



Re: Encoding trouble

2022-01-30 Thread David Kastrup
David Kastrup  writes:

> Denis Bitouzé  writes:
>
>> On 30/01/22 at 15:52, David Kastrup wrote:
>>
>>> That would be pretty annoying for people working with any Latin-x
>>> encoding other than Latin-1 (or in general, any encoding not in Emacs's
>>> default autodetection set).
>>
>> In case of an encoding Emacs cannot detect, AUCTeX would rely on the
>> `inputenc` option.

That does not even make sense since all of the Latin-x options are the
same in autodetection.  They cannot be distinguished since they use the
same code points.

Essentially, a Latin-1 user would get every Latin-x except Latin-1
displayed wrongly.  And the same for Latin-2 users and so on.
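
As a small illustration of why the bytes alone cannot settle this (a
sketch to evaluate with `M-:' or in `*scratch*'; the byte #xB1 is just
one convenient example):

```elisp
;; The single byte #xB1 is valid in every ISO 8859 variant, so nothing
;; in the byte itself tells which variant the author meant.
(decode-coding-string (unibyte-string #xB1) 'iso-8859-1) ; "±" (U+00B1)
(decode-coding-string (unibyte-string #xB1) 'iso-8859-2) ; "ą" (U+0105)
```

Both calls succeed without error; only outside information, such as
the `inputenc` option or a `coding:' cookie, says which reading is
intended.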

>>> Emacs showed you what LaTeX would have shown you.
>>
>> I'm not sure I see your point here.
>
> What is the point in letting Emacs display the input differently from
> how LaTeX would interpret it?

-- 
David Kastrup



Re: Encoding trouble

2022-01-30 Thread David Kastrup
Denis Bitouzé  writes:

> On 30/01/22 at 15:52, David Kastrup wrote:
>
>> That would be pretty annoying for people working with any Latin-x
>> encoding other than Latin-1 (or in general, any encoding not in Emacs's
>> default autodetection set).
>
> In case of an encoding Emacs cannot detect, AUCTeX would rely on the
> `inputenc` option.
>
>> Emacs showed you what LaTeX would have shown you.
>
> I'm not sure I see your point here.

What is the point in letting Emacs display the input differently from
how LaTeX would interpret it?

-- 
David Kastrup



Re: Encoding trouble

2022-01-30 Thread Denis Bitouzé
On 30/01/22 at 15:52, David Kastrup wrote:

> That would be pretty annoying for people working with any Latin-x
> encoding other than Latin-1 (or in general, any encoding not in Emacs's
> default autodetection set).

In case of an encoding Emacs cannot detect, AUCTeX would rely on the
`inputenc` option.

> Emacs showed you what LaTeX would have shown you.

I'm not sure I see your point here.
-- 
Denis



Re: Encoding trouble

2022-01-30 Thread David Kastrup
Denis Bitouzé  writes:

> Hi,
>
> several years ago, I already faced the following problem and,
> unfortunately, it happened again yesterday, which made me lose quite
> some time.
>
> Let me explain myself: I had a LaTeX file encoded in latin1 that
> I wanted to encode in UTF-8. I used an external tool, in this case
> `utrac`, which confirmed the starting (latin1) and ending (UTF-8)
> encoding. But, when I opened this file in Emacs with AUCTeX enabled, the
> accented characters were wrong and it was only when I saw that the file
> contained `\usepackage[latin1]{inputenc}` that I understood where the
> problem came from: changing it to `\usepackage[utf8]{inputenc}` solved
> it.
>
> So, here is my request: would it be possible that, for the detection of
> the real encoding of the file, AUCTeX relies not on the `inputenc`
> package option, but rather on the Emacs heuristics and that, in case of
> discrepancy between the two, it issues a warning?

That would be pretty annoying for people working with any Latin-x
encoding other than Latin-1 (or in general, any encoding not in Emacs's
default autodetection set).

Emacs showed you what LaTeX would have shown you.

-- 
David Kastrup