[AUCTeX-devel] Re: BibTeX-mode: Key generation when latin-1 characters appear in author field

2007-08-02 Thread Stefan Monnier
> Does "drop non-ascii chars" mean that "räksmörgås" becomes "rksmrgs",
> or "raksmorgas"? I'm afraid you mean the former ... But what would
> such a function do to a Greek/Cyrillic/Japanese BibTeX entry? I'd
> guess there is nothing left when you drop non-ascii chars.

Yup, there's nothing left.  So what: we're talking about a suggestion to put
in the minibuffer.


Stefan


___
auctex-devel mailing list
auctex-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/auctex-devel


[AUCTeX-devel] Re: BibTeX-mode: Key generation when latin-1 characters appear in author field

2007-08-02 Thread Christian Schlauer
Stefan Monnier writes:

>>>   text-mode:            Grüß Gott
>>>   tex-mode:             Gr\"u{\ss} Gott
>>>   german latex-mode:    Gr"u"s Gott
>>>   html-mode:            Gr&uuml;&szlig; Gott
>
> AFAIK, nowadays in LaTeX, you're better off using "Grüß Gott" with the
> proper input encoding.

Yes, that's what I have done since I started using LaTeX 10 years ago.
(But I didn't do it in .bib files until 2003.)

I also hope that Roland's
`convert-readable-words-in-backslash-or-ampersand-escaped-sequences'
function isn't necessary anymore.

>
> ELISP> (reftex-latin1-to-ascii "räksmörgås")
>
> Before trying to solve the problem for latin-1, then latin-2, then Arabic,
> then Chinese, etc., we'd better write a real fix that correctly (though
> suboptimally) handles all cases: drop non-ascii chars.

Does "drop non-ascii chars" mean that "räksmörgås" becomes "rksmrgs",
or "raksmorgas"? I'm afraid you mean the former ... But what would
such a function do to a Greek/Cyrillic/Japanese BibTeX entry? I'd
guess there is nothing left when you drop non-ascii chars.

-- 
Christian Schlauer





[AUCTeX-devel] Re: BibTeX-mode: Key generation when latin-1 characters appear in author field

2007-08-02 Thread Stefan Monnier
>>   text-mode:            Grüß Gott
>>   tex-mode:             Gr\"u{\ss} Gott
>>   german latex-mode:    Gr"u"s Gott
>>   html-mode:            Gr&uuml;&szlig; Gott

AFAIK, nowadays in LaTeX, you're better off using "Grüß Gott" with the
proper input encoding.  For HTML mode as well.

ELISP> (reftex-latin1-to-ascii "räksmörgås")

Before trying to solve the problem for latin-1, then latin-2, then Arabic,
then Chinese, etc., we'd better write a real fix that correctly (though
suboptimally) handles all cases: drop non-ascii chars.  Then we can add
a preprocessing function that tries to be clever.
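
A minimal sketch of that two-stage idea in Emacs Lisp, under stated
assumptions: the names `my-key-drop-non-ascii',
`my-key-preprocess-function' and `my-key-sanitize' are hypothetical and
are not part of bibtex.el or RefTeX. The fallback only deletes
characters; anything clever is left to an optional preprocessing
function.

```elisp
;; Hypothetical names throughout; a sketch, not existing Emacs code.

(defun my-key-drop-non-ascii (string)
  "Return STRING with every non-ASCII character removed."
  (replace-regexp-in-string "[^[:ascii:]]" "" string))

(defvar my-key-preprocess-function nil
  "Optional function taking a string and returning a string.
When non-nil, it is applied before non-ASCII characters are dropped,
so it may transliterate the characters it knows about.")

(defun my-key-sanitize (string)
  "Preprocess STRING if requested, then drop what is still non-ASCII."
  (my-key-drop-non-ascii
   (if my-key-preprocess-function
       (funcall my-key-preprocess-function string)
     string)))
```

With no preprocessing function set, `(my-key-sanitize "räksmörgås")'
returns "rksmrgs", and a purely Greek or Cyrillic string comes back
empty. Setting the variable to a transliteration function (for
instance `reftex-latin1-to-ascii', assuming RefTeX is loaded) would
let known characters survive as ASCII before the rest is dropped.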


Stefan




[AUCTeX-devel] Re: BibTeX-mode: Key generation when latin-1 characters appear in author field

2007-08-01 Thread Christian Schlauer
[Crossposting to gmane.emacs.auctex.devel where RefTeX is maintained
now]

The following is from a discussion in 2005, starting with
, where I wrote:

> I create the following entry in a BibTeX file -- the critical thing is
> that the name of the author contains an umlaut. (This is okay, as the
> BibTeX versions that nowadays come with TeX distributions are 8-bit
> capable, which means that I can enter umlauts in the .bib file
> directly instead of using \"o or something similar -- I just have to
> specify \usepackage[latin1]{inputenc} in the document preamble as
> well):
>
> @Article{,
>   author =       {B. Blöd},
>   title =        {Test},
>   journal =      {A},
>   year =         {2005},
>   OPTkey =       {},
>   OPTvolume =    {},
>   OPTnumber =    {},
>   OPTpages =     {},
>   OPTmonth =     {},
>   OPTnote =      {},
>   OPTannote =    {}
> }
>
> Now, press `C-c C-c' inside the entry -- Emacs suggests `blöd05:_test'
> as the key to use

... and Emacs 22.1 uses that key. I also wrote:

> It seems to me that the conversion of funny characters to ASCII
> characters is probably safer.

Roland Winkler replied in
:

> I have been thinking about this for a little while. I think it
> goes beyond BibTeX.
>
> Some time ago I wrote a little package, umlaute.el; see
> http://www.tfkp.physik.uni-erlangen.de/~winkler/src/umlaute.el 
> The idea is to have `translation tables' so that one can go back and
> forth between different `representations' of a character, depending
> on the emacs mode, for example (I live in southern Germany)
>
>   text-mode:            Grüß Gott
>   tex-mode:             Gr\"u{\ss} Gott
>   german latex-mode:    Gr"u"s Gott
>   html-mode:            Gr&uuml;&szlig; Gott
>   other modes (7 bit):  Gruess Gott
>
> BibTeX keys probably should use the 7-bit `representation'. 
> For German umlaute the 7-bit `representation' is fairly well defined
> - even though I expect that some people would prefer a translation
> table that simply drops the double dots, but does not add the extra
> `e'. The best choice might depend on the context.
> However, I do not know how this could be done for other languages.
> Some Spanish people use `Señor' -> `Se~nor'. Certainly, this is not
> a good choice in the context of BibTeX keys.
>
> I do not know whether such a feature would be useful for other emacs
> packages, too, nor do I know how it could best be implemented.
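
The translation-table idea quoted above can be sketched roughly as
follows. This is not the code of umlaute.el: the names `my-repr-tables'
and `my-repr-convert' are hypothetical, only the two characters from
the `Grüß Gott' example are covered, and the sketch converts in one
direction only (from the 8-bit text to another representation).

```elisp
;; Hypothetical names; an illustration, not the umlaute.el implementation.

(defvar my-repr-tables
  '((tex        . ((?ü . "\\\"u") (?ß . "{\\ss}")))
    (german     . ((?ü . "\"u")   (?ß . "\"s")))
    (ascii      . ((?ü . "ue")    (?ß . "ss")))
    ;; Variant that drops the dots without adding the extra `e'.
    (ascii-bare . ((?ü . "u")     (?ß . "ss"))))
  "Alist mapping a representation name to a character translation table.")

(defun my-repr-convert (string repr)
  "Rewrite STRING using the table named REPR in `my-repr-tables'.
Characters without an entry in the table are copied unchanged."
  (let ((table (cdr (assq repr my-repr-tables))))
    (mapconcat (lambda (ch)
                 (or (cdr (assq ch table))
                     (char-to-string ch)))
               string "")))

;; (my-repr-convert "Grüß Gott" 'ascii)       gives the text  Gruess Gott
;; (my-repr-convert "Grüß Gott" 'ascii-bare)  gives the text  Gruss Gott
;; (my-repr-convert "Grüß Gott" 'german)      gives the text  Gr"u"s Gott
```

Keeping the tables as data makes the choice between `ue' and plain `u'
a matter of selecting a table, which is the context dependence
mentioned in the quoted message. A key generator would presumably pick
one of the 7-bit tables.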

FWIW, RefTeX also needs to "convert" the text of \section{} etc. in
order to create a label. So yes, there are other packages (that are
included in Emacs!) that need this, too. I played a little with the
function `reftex-latin1-to-ascii', see below:

*** Welcome to IELM ***  Type (describe-mode) for help.
ELISP> (reftex-latin1-to-ascii "räksmörgås")
"raksmorgas"
ELISP> (reftex-latin1-to-ascii "blåbærsyltetøy")
"blabarsyltetoy"
ELISP> (reftex-latin1-to-ascii "Viele Grüße")
"Viele Gru3e"
ELISP> (reftex-latin1-to-ascii "Other letters: ł œ š ñ")
"Other letters: ł œ š n"
ELISP> 

Maybe BibTeX-mode and RefTeX could share some label/key generation
code, and use `reftex-latin1-to-ascii' as a starting point?

Regards,

Christian Schlauer


