On 2013-03-31, Georg Baum wrote:
> Guenter Milde wrote:
>> On 2013-03-27, Georg Baum wrote:

...

>> Without file inclusions, the "LaTeX encoding" of the exported file does
>> not matter for the Postscript/PDF-generation:

>> * The encoding of the LyX-document itself is always utf8 (since several
>>   versions of LyX).
>> * (re) import into LyX converts from the "LaTeX encoding" into utf8.
>> * With 8-bit LaTeX, every non-ASCII character is converted to LICRs
>>   (either by LyX (if the encoding is set to ASCII), or by the inputenc
>>   package).
>> * With (Xe/LuaLaTeX), the LaTeX encoding is always utf8.

> In theory you are right. In practice this is not always the case for 8-bit
> LaTeX. Either you choose utf8 encoding. In that case you need to load any of
> the existing utf8 support packages like utf8 or utf8x, but none of them is
> complete. Also, listings do not work with utf8. Or you choose any other
> encoding, but in that case you rely on lib/unicodesymbols, which is
> incomplete as well, and may even load packages that are incompatible to each
> other.

I don't know about CJK and other Asian languages, but for Latin, Greek,
and Cyrillic, lib/unicodesymbols is more complete than any LaTeX inputenc
file.

The problem of incompatible packages needs to be resolved, but again this
would not go away using one of the existing "LaTeX encodings" (let alone a
mix of them).

The "force" flag in lib/unicodesymbols provides a workaround for the
incomplete translation in inputenc's utf8. utf8x is non-standard and
unsupported and should only be used by users who know the dangers and
incompatibilities.

I agree regarding "listings". Does it work with 8-bit encodings?

>> For included files it is IMO quite sensible to assume the locale encoding
>> as a first guess. If the "LaTeX encoding" and the locale encoding are the
>> same, chances are best that no re-encoding is required.

> Why is it sensible to choose the locale encoding? This assumes that the
> document language matches the locale, but this is an invalid assumption
> IMHO, as I tried to explain. I know that many text editors assume that as
> default, but that does not make it better.

For LaTeX documents, there is no requirement that the encoding matches a
language default. All characters can be represented in the LaTeX internal
character representation (LICR), a pure ASCII encoding using a combination
of accent- and character-macros. Both inputenc's *.enc files and the
translations in lib/unicodesymbols transform into LICRs, so the 8-bit
encoding default specified in "lib/languages" is merely for convenience
(and from a pre-utf8 time).

The encoding default of the OS can be assumed to be the encoding of the
majority of files on the system. Hence, this choice would minimize problems
with included files.

...

>> This is why, the current default (language-dependent multi-encoding)
>> is an outdated and very bad choice. It was justified to a certain degree
>> when LyX still used 8-bit encodings for the *.lyx file itself but this is
>> now several years ago.

> I do not agree. The main purpose of the exported LaTeX is not to be edited
> with a text editor, but to be typeset. For the latter purpose the default is
> IMHO still the best one, at least as long as utf8 support is as limited as
> nowadays in 8-bit LaTeX.

Exported LaTeX serves several purposes. Of course, the "internally"
exported file will be typeset directly. Explicit export (File>Export>...)
is done for storing in a more generic format, post-processing, sharing
with non-LyX co-workers, etc. In all these use cases, a mixed encoding is
an annoyance rather than a help. If not utf8, we should at least use a
consistent encoding, either the main document language's default 8-bit
encoding or ASCII.

...

>> Considering, that UTF-8 is nowadays the default on most systems, I'd
>> recommend to change the default. However, only after bug
>> http://www.lyx.org/trac/ticket/8600
>> is fixed.

> There are more, http://www.lyx.org/trac/ticket/8408 and
> http://www.lyx.org/trac/ticket/6789.

The latter concerns a problem with utf8x (ucs) and does not affect the
standard utf8 (inputenc).

Günter
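
To illustrate the LICR point above (every character can be written in pure ASCII with accent and character macros), here is a minimal Python sketch. It is not how LyX or inputenc do the conversion; the function name `to_licr` and the two small tables are made up for this example and cover only a handful of characters.

```python
import unicodedata

# Hypothetical, deliberately tiny table: combining mark -> LaTeX accent macro.
ACCENTS = {
    "\u0300": "\\`",   # grave
    "\u0301": "\\'",   # acute
    "\u0302": "\\^",   # circumflex
    "\u0303": "\\~",   # tilde
    "\u0308": '\\"',   # diaeresis / umlaut
}

# Characters without a canonical decomposition need explicit entries.
SYMBOLS = {
    "\u00df": "\\ss{}",  # sharp s
    "\u00f8": "\\o{}",   # o with stroke
    "\u00e6": "\\ae{}",  # ae ligature
}


def to_licr(text):
    """Return an ASCII-only string using accent and character macros.

    Sketch only: handles one accent per base letter and does not
    special-case dotless i/j.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # plain ASCII passes through unchanged
            continue
        if ch in SYMBOLS:
            out.append(SYMBOLS[ch])
            continue
        # Split a precomposed letter into base letter + combining mark.
        parts = unicodedata.normalize("NFD", ch)
        if len(parts) == 2 and parts[0].isascii() and parts[1] in ACCENTS:
            out.append(ACCENTS[parts[1]] + "{" + parts[0] + "}")
        else:
            raise ValueError("no translation known for %r" % ch)
    return "".join(out)


if __name__ == "__main__":
    print(to_licr("G\u00fcnter"))
    print(to_licr("Stra\u00dfe"))
```

Whatever 8-bit encoding the source file uses, the result after such a translation is the same ASCII text, which is why the per-language encoding default is only a convenience.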
