Re: [updated PATCH] bugs 3958, 3313 and 3976 lyx2lyx and unicode characters
On Friday 13 July 2007 13:16:46 Anders Ekberg wrote:
> I have tried to address Georg's and Juergen's comments.
> To avoid data-loss, the function is only run if the encoding is auto
> or default and there are no language changes (overly conservative,
> but possible to work around, as commented in the code). Now, I
> *assume* this should prevent any cases that can cause data-loss (i.e.
> assuming unicode encoding when it is not), but I don't know the
> format well enough to be sure, so please correct the code if I'm wrong.

We can extend the test to see if the included languages have the same
encoding...

> To avoid the need for two unicodesymbols files, I have removed that
> requirement by assuming any command that includes { and } and not any
> of the exception tokens, to be an accented character. I also check
> for combining characters and don't translate these (which allowed the
> removal of an if-test in the end of the code). It is not pretty, but
> works...
>
> Details and an idea for a (hopefully) better solution are in bugzilla
> http://bugzilla.lyx.org/show_bug.cgi?id=3958

As I said there, the format change needs to be in lyx_1_5.py and not in
LyX.py.

> Anders

--
José Abílio
Re: [updated PATCH] bugs 3958, 3313 and 3976 lyx2lyx and unicode characters
A hopefully better solution is posted to bugzilla. Details in
http://bugzilla.lyx.org/show_bug.cgi?id=3958

/Anders

Anders Ekberg Fri, 13 Jul 2007 05:18:45 -0700
> I have tried to address Georg's and Juergen's comments.
> To avoid data-loss, the function is only run if the encoding is auto
> or default and there are no language changes (overly conservative,
> but possible to work around, as commented in the code). Now, I
> *assume* this should prevent any cases that can cause data-loss (i.e.
> assuming unicode encoding when it is not), but I don't know the
> format well enough to be sure, so please correct the code if I'm wrong.
>
> To avoid the need for two unicodesymbols files, I have removed that
> requirement by assuming any command that includes { and } and not any
> of the exception tokens, to be an accented character. I also check
> for combining characters and don't translate these (which allowed the
> removal of an if-test in the end of the code). It is not pretty, but
> works...
>
> Details and an idea for a (hopefully) better solution are in bugzilla
> http://bugzilla.lyx.org/show_bug.cgi?id=3958
[updated PATCH] bugs 3958, 3313 and 3976 lyx2lyx and unicode characters
I have tried to address Georg's and Juergen's comments.

To avoid data loss, the function is only run if the encoding is auto
or default and there are no language changes (overly conservative,
but possible to work around, as commented in the code). Now, I
*assume* this should prevent any cases that can cause data loss (i.e.
assuming unicode encoding when it is not), but I don't know the
format well enough to be sure, so please correct the code if I'm wrong.

To avoid the need for two unicodesymbols files, I have removed that
requirement by assuming that any command that includes { and } and none
of the exception tokens is an accented character. I also check
for combining characters and don't translate these (which allowed the
removal of an if-test at the end of the code). It is not pretty, but
it works...

Details and an idea for a (hopefully) better solution are in bugzilla
http://bugzilla.lyx.org/show_bug.cgi?id=3958

Anders

[Attachment: patch (binary data)]
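[Editorial sketch, not part of the attached patch.] The two checks described above could look roughly like the following. The exception-token list, the function names and the way the header and body are passed in are all assumptions made for illustration; the real patch may differ in every detail. The sketch also assumes that the document header carries a line starting with \inputencoding and that language changes in the body appear as lines starting with "\lang ".

```python
import unicodedata

# Hypothetical exception tokens: commands that contain braces but are
# not accent commands. The real list in the patch may be different.
EXCEPTION_TOKENS = ("\\ensuremath", "\\text", "\\mathbb")


def looks_like_accent(command, exceptions=EXCEPTION_TOKENS):
    """Heuristic from the message: a command with both { and } that
    contains none of the exception tokens is treated as an accent."""
    if "{" not in command or "}" not in command:
        return False
    return not any(token in command for token in exceptions)


def is_combining(char):
    """True for combining characters, which are left untranslated."""
    return unicodedata.combining(char) != 0


def safe_to_revert(header_lines, body_lines):
    """Conservative guard: only proceed when the encoding is auto or
    default and the body contains no language change (assumed to be a
    line starting with '\\lang ')."""
    encoding = None
    for line in header_lines:
        if line.startswith("\\inputencoding"):
            parts = line.split()
            encoding = parts[1] if len(parts) > 1 else None
    if encoding not in ("auto", "default"):
        return False
    return not any(line.startswith("\\lang ") for line in body_lines)
```

The guard is deliberately strict: a single language change anywhere in the body disables the conversion, which matches the "overly conservative" behaviour described in the message.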
Re: [PATCH] bugs 3958, 3313 and 3976 lyx2lyx and unicode characters
On 9 jul 2007, at 14.15, Jean-Marc Lasgouttes wrote:
> "Anders" == Anders Ekberg <[EMAIL PROTECTED]> writes:
>
> Anders> The patch addresses this by excluding all accented characters
> Anders> from the list that revert_unicode processes. In principle this
> Anders> results in that accented characters get "properly" translated
> Anders> and remaining unicode characters are replaced by ERT or math
> Anders> commands (if they are in the list of characters).
>
> What is the difference between reverse_unicode and revert accent? Is it
> just generating ERT versus InsetLaTexAccent?

Basically yes (ERT or math inset)

> In this case, lyx2lyx could look at the generated string and decide
> what to do?

You mean merge the two? That would of course be the best solution, but
that requires someone with better knowledge of revert_accent and the
LyX format (and time ;-) than I have.

Otherwise I think it would be quite straightforward to merge
reverse_unicode. Just read in the unicodesymbols file, keep track of
whether you're in an inset (and which) and whether you need to insert a
math inset or an ERT. Then get the corresponding replacement string.

/Anders
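[Editorial sketch, not from the thread.] The merge outlined above (read a symbol table, track the current context, pick the replacement string) might be sketched as follows. The tab-separated three-column input used here is a hypothetical simplification and is not the real unicodesymbols file format; the context labels "text", "ert" and "math" and both function names are likewise invented for illustration.

```python
def load_symbols(lines):
    """Build {character: (text_command, math_command)} from lines of the
    hypothetical form 'code<TAB>text_command<TAB>math_command'."""
    table = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # skip malformed lines in this sketch
        code, text_cmd, math_cmd = fields
        table[chr(int(code, 16))] = (text_cmd, math_cmd)
    return table


def replacement(char, table, context="text"):
    """Return (kind, string), where kind says how the caller should
    insert the string: 'plain' (as is), 'ert' (wrap in an ERT inset)
    or 'math' (wrap in a math inset)."""
    entry = table.get(char)
    if entry is None:
        return ("plain", char)
    text_cmd, math_cmd = entry
    if context == "math":
        # Already inside math: prefer the math command.
        return ("plain", math_cmd or text_cmd)
    if context == "ert":
        # Already inside ERT: prefer the text command.
        return ("plain", text_cmd or math_cmd)
    if text_cmd:
        return ("ert", text_cmd)
    return ("math", math_cmd)
```

A caller would walk the document body, update the context whenever an inset opens or closes, and call replacement() for each non-ASCII character. The fallbacks in the "math" and "ert" branches are a simplification; a real implementation would have to decide what to do when only the other kind of command exists.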
Re: [PATCH] bugs 3958, 3313 and 3976 lyx2lyx and unicode characters
> "Anders" == Anders Ekberg <[EMAIL PROTECTED]> writes:

Anders> The patch addresses this by excluding all accented characters
Anders> from the list that revert_unicode processes. In principle this
Anders> results in that accented characters get "properly" translated
Anders> and remaining unicode characters are replaced by ERT or math
Anders> commands (if they are in the list of characters).

What is the difference between reverse_unicode and revert accent? Is it
just generating ERT versus InsetLaTexAccent?

In this case, lyx2lyx could look at the generated string and decide
what to do?

JMarc
Re: [PATCH] bugs 3958, 3313 and 3976 lyx2lyx and unicode characters
On 9 jul 2007, at 11.36, Jürgen Spitzmüller wrote:
> Anders Ekberg wrote:
> > > And then we have to maintain two unicodesymbols lists? I do not
> > > think this is a good idea.
> >
> > I know, I tried some different options (like searching for {...}),
> > but didn't come up with anything I thought was better. So as a
> > compromise, I kept the lists identical and just commented out the
> > accented characters. If there is little maintenance of the
> > unicodesymbols list, I think it is acceptable (you can do a diff to
> > spot errors fairly easily).
>
> I think lots of symbols will be added soon after 1.5.0. The need to
> always synchronize the two lists is inefficient and error-prone.
>
> > > How about adding a new flag "revert" to the unicodesymbols list
> > > instead?
> >
> > I thought about that too, but was afraid this would mess things up in
> > other places.
>
> Where?

I assumed in the conversion to TeX. But I don't know anything about
that, so I took what I thought was the safest route.

Anders
Re: [PATCH] bugs 3958, 3313 and 3976 lyx2lyx and unicode characters
Anders Ekberg wrote:
> > And then we have to maintain two unicodesymbols lists? I do not
> > think this is a good idea.
>
> I know, I tried some different options (like searching for {...}),
> but didn't come up with anything I thought was better. So as a
> compromise, I kept the lists identical and just commented out the
> accented characters. If there is little maintenance of the
> unicodesymbols list, I think it is acceptable (you can do a diff to
> spot errors fairly easily).

I think lots of symbols will be added soon after 1.5.0. The need to
always synchronize the two lists is inefficient and error-prone.

> > How about adding a new flag "revert" to the unicodesymbols list
> > instead?
>
> I thought about that too, but was afraid this would mess things up in
> other places.

Where?

Jürgen
Re: [PATCH] bugs 3958, 3313 and 3976 lyx2lyx and unicode characters
Jürgen Spitzmüller Mon, 09 Jul 2007 02:05:27 -0700
> Anders Ekberg wrote:
> > The patch addresses this by excluding all accented characters from
> > the list that revert_unicode processes. In principle this results in
> > that accented characters get "properly" translated and remaining
> > unicode characters are replaced by ERT or math commands (if they are
> > in the list of characters).
>
> And then we have to maintain two unicodesymbols lists? I do not think
> this is a good idea.

I know, I tried some different options (like searching for {...}),
but didn't come up with anything I thought was better. So as a
compromise, I kept the lists identical and just commented out the
accented characters. If there is little maintenance of the
unicodesymbols list, I think it is acceptable (you can do a diff to
spot errors fairly easily).

> How about adding a new flag "revert" to the unicodesymbols list
> instead?

I thought about that too, but was afraid this would mess things up in
other places.

/Anders
Re: [PATCH] bugs 3958, 3313 and 3976 lyx2lyx and unicode characters
Anders Ekberg wrote:
> The patch addresses this by excluding all accented characters from
> the list that revert_unicode processes. In principle this results in
> that accented characters get "properly" translated and remaining
> unicode characters are replaced by ERT or math commands (if they are
> in the list of characters).

And then we have to maintain two unicodesymbols lists? I do not think
this is a good idea.

How about adding a new flag "revert" to the unicodesymbols list instead?

Jürgen