Hi all,

Here is an implementation of the mechanism discussed below. I split the
largest part of the patch into two parts: first allowing CharInfo to store
several commands, then parsing the new lines (with just two new entries in
unicodesymbols). These patches also rename several methods and variables to
bring them in line with newer code (camel case).
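
To make the idea concrete, here is a minimal, self-contained sketch of how
supplementary lines could be accumulated per code point. It is not the code
from the patches: SymbolInfo, SymbolTable, readQuoted and addLine are
hypothetical names, and the line format is simplified to the text command,
preamble and flags fields.

```cpp
#include <cstdint>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for CharInfo: the first command is the one used for
// LaTeX output, later ones are synonyms accepted when reading LaTeX.
struct SymbolInfo {
    std::vector<std::string> textCommands;
    std::string flags;
};

using SymbolTable = std::map<std::uint32_t, SymbolInfo>;

// Read the next double-quoted field; a backslash escapes the next character.
// Returns false if the next token is not a quoted field (e.g. a # comment).
bool readQuoted(std::istream & in, std::string & out)
{
    in >> std::ws;
    if (in.peek() != '"')
        return false;
    in.get(); // consume the opening quote
    out.clear();
    char c;
    while (in.get(c)) {
        if (c == '"')
            return true;
        if (c == '\\' && !in.get(c))
            return false;
        out += c;
    }
    return false; // unterminated field
}

// Add one line to the table. A line for a code point that is already known
// only contributes an extra command; the flags of the first line are kept.
bool addLine(SymbolTable & table, std::string const & line)
{
    std::istringstream in(line);
    std::string codeToken;
    std::string command;
    if (!(in >> codeToken) || codeToken[0] == '#' || !readQuoted(in, command))
        return false;
    // Assumes a well-formed hexadecimal code point such as 0x00a1.
    std::uint32_t const code =
        static_cast<std::uint32_t>(std::stoul(codeToken, nullptr, 16));
    SymbolInfo & info = table[code];
    info.textCommands.push_back(command);
    std::string preamble;
    std::string flags;
    if (readQuoted(in, preamble) && readQuoted(in, flags)
        && info.textCommands.size() == 1)
        info.flags = flags;
    return true;
}
```

Keeping the commands in insertion order means the first occurrence stays the
preferred one for LaTeX output, as requested in the thread.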

Then, I had to patch Encodings::fromLaTeXCommand so that the new entries are
actually used. That code matches entries by looking up prefixes of the
entered LaTeX code until one matches. The problem is with !`, because !
alone is already a match. Hence, I added a simple check: if the whole
command corresponds to an entry in unicodesymbols, use that. This doesn't
fix the general issue (the proper fix would be to take the longest matching
prefix), but that would be much more complex to implement and the added
value seems low to me.
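
The check can be sketched roughly as follows. This is only an illustration
under assumptions, not the code from the patch: CommandTable, Match and
matchCommand are hypothetical names, and the existing lookup is assumed to
try prefixes shortest first.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical command table: LaTeX command -> Unicode code point.
using CommandTable = std::map<std::string, char32_t>;

struct Match {
    char32_t symbol;     // 0 means "no match"
    std::size_t length;  // number of characters of the input consumed
};

Match matchCommand(CommandTable const & table, std::string const & cmd)
{
    // Added check: an entry for the whole command wins.
    auto const exact = table.find(cmd);
    if (exact != table.end())
        return {exact->second, cmd.size()};

    // Behaviour as described above (assumed order: shortest prefix first).
    for (std::size_t n = 1; n < cmd.size(); ++n) {
        auto const it = table.find(cmd.substr(0, n));
        if (it != table.end())
            return {it->second, n};
    }
    return {0, 0};
}
```

With entries for both ! and !`, the input !` now resolves to the whole
command, whereas an input such as !`x still stops at ! (length 1); that is
the remaining longest-prefix issue.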

Does it look alright to you? If so, I will push these patches.

On Mon, 14 Feb 2022 at 06:59, Jürgen Spitzmüller <sp...@lyx.org> wrote:

> Am Montag, dem 14.02.2022 um 03:24 +0100 schrieb Thibaut Cuvelier:
> > Thanks, I just did that (with a small test file): a460097823.
> >
> > However, this test showed a limitation in the current unicodesymbols:
> > there can be only one LaTeX command per symbol. This is a limitation
> > in only a few cases, like
> > \textexclamdown and !`: both of them are mapped to ¡ (i.e. &#161;),
> > but the file only allows for one mapping.
>
> Yes, this stems from the history of this file, which was originally
> created only for the Unicode-to-LaTeX route.
>
> > If we decide to solve this problem, we could have several solutions
> > (all modifying Encodings::read), I could think of two:
> > - either use a separator symbol in the latexcommand part of each
> > unicodesymbols line, but it would be hard to find a single character
> > that is never used for latexcommands
>
> yes.
>
> > - or have multiple lines for a single character, with duplicate
> > information for the second one or a simpler line format for these
> > entries. For instance, for the inverted exclamation mark:
> >
> > 0x00a1 "\\textexclamdown"         "" "force=cp862;cp1255;euc-jp;euc-jp-platex;euc-kr;utf8-platex" # INVERTED EXCLAMATION MARK
> > 0x00a1 "!`" # Implicitly, all the other parameters still apply
>
> I'd also prefer this. For LaTeX output, the first occurrence should be
> preferred.
>
> > What do you think of this? Should this be done?
>
> I think it's definitely useful.
>
> Jürgen
>
>

Attachment: 0001-CharInfo-allow-to-store-several-commands-both-text-a.patch
Attachment: 0002-unicodesymbols-parse-supplementary-lines-to-encode-a.patch
Attachment: 0003-Encodings-fromLaTeXCommand-if-the-command-directly-m.patch
Attachment: 0004-unicodesymbols-add-several-synonyms.patch

-- 
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel