Re: Strange "feature"
Jean-Marc Lasgouttes [EMAIL PROTECTED] writes:

| "Lars" == Lars Gullik Bjønnes [EMAIL PROTECTED] writes:
|
| Lars> The escaping was not just bad, it was plain wrong. Run lyx with
| Lars> -dbg key,keymap and you can see for yourself. Why demand escaping
| Lars> when it is not needed?
|
| The escaping was right with LyXLex, AFAIK.

With lyxlex it didn't matter, since the result was not used anyway.

| Sure, if I add a space at the beginning of the line for some reason,
| it does not get parsed. On the other hand, I can add any junk at the
| end of the line, and it will be ignored. Talk about robustness... I
| know the regexp can be made better,

Sure it can.

"^[ \t]*([12][0-9][0-9])[ \t]+\"([^ ]+)\"[ \t]+.*"

| Lars> | - the other cdef files have not been modified, and thus the
| Lars> |   problem remains for them
|
| Lars> Note that the actual contents of the .cdef files have not been
| Lars> used until I made my changes yesterday.
|
| Yes, but they were used before you ditched lyxlex, I believe.

No, they were not. At least not in 1.0.4: just look at CharacterSet::encodeString in chset.C. It returns true if we have a str matching one of the strings in the cmap; the number is never used.

| And maybe it is just that I do not know regexps well enough, but how
| are you sure that the match for a string like "\"{e}" will not be just
| "\"? Are _all_ regexp implementations required to do this?

Yes, a regex will always try to match as much as possible.

| Lars> | - If you do not like those escapes in the old files, we can in
| Lars> |   fact just forget about the quotes and it should remove the
| Lars> |   need for quoting (does it? if not this is a bug in lyxlex).
|
| Lars> Yes that was one of my comments, why use two other chars as
| Lars> delimiters.
|
| What I definitely do not like about your solution is that cdef files do
| not have a syntax anymore (unless you count `anything that matches my
| regexp' as a syntax). In fact, the only way to document the syntax is
| probably to give the regexp...

Syntax: nnn "tex-char"

I agree that "" looks a bit weird when you allow " inside too.

\def{tex-char}{nnn}

could be an alternative.

| Lars> Wrong use of lyxlex. To use lyxlex as a glorified tokenizer is
| Lars> not right.
|
| Why? A tokenizer is just what is needed here. If you want nice tokens,
| just modify the syntax to be
|   Chardef 192 "\\`{A}"
| and you have nice tokens to play with...

How will that be any nicer? Then lyxlex has one item in its keyword table, and you still have to "manually" parse the rest of the line.

I really think that avoiding the escaping is nice.

Lgb
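To make the "match as much as possible" point concrete, here is a minimal sketch in Python of the pattern quoted above, with the C string escaping removed. The function name `parse_cdef_line` and the sample line are made up for illustration, and Python's `re` engine is a backtracking one, so it stands in for, rather than reproduces, whatever regex library LyX used at the time.

```python
import re

# The pattern from the message, without the C-string backslashes.
# Assumption: Python's `re` is only a stand-in for the original regex library.
CDEF_LINE = re.compile(r'^[ \t]*([12][0-9][0-9])[ \t]+"([^ ]+)"[ \t]+.*')


def parse_cdef_line(line):
    """Return (code, tex_string) for a matching line, or None otherwise."""
    m = CDEF_LINE.match(line)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)


# Hypothetical cdef line; the TeX string itself contains a double quote.
sample = '246 "\\"{o}" trailing text'
result = parse_cdef_line(sample)
print(result[0], result[1])
```

The greedy `[^ ]+` first swallows everything up to the space, including the closing quote, then backs off one character so that the closing `"` can match. The captured group is therefore the full `\"{o}`, not just `\`. Note that this pattern only matches when at least one space or tab follows the closing quote.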
Re: Strange "feature"
"Lars" == Lars Gullik Bjønnes [EMAIL PROTECTED] writes:

Lars> Sure it can.
Lars> "^[ \t]*([12][0-9][0-9])[ \t]+\"([^ ]+)\"[ \t]+.*"

Wouldn't something like

"^[ \t]*([12][0-9][0-9])[ \t]+\"(.*)\"[ \t]*$"

be better? It avoids junk at the end and does not force a space after the final ".

Lars> No they were not. At least not in 1.0.4, just look at
Lars> CharacterSet::encodeString in chset.C it returns true if we have
Lars> a str matching one of the strings in the cmap, the number is
Lars> never used.

Yes, I saw that, but I thought that it was used elsewhere, somewhere I did not see. What is the theory of how this stuff is used, anyway?

Lars> \def{tex-char}{nnn}
Lars> could be an alternative

Since nnn almost surely contains braces (\'{e}), this would not be better. Moreover, this would use a false TeX syntax that does not make sense in TeX. We might want to use the syntax of LaTeX inputenc files, though, and parse that directly...

Lars> How will that be any nicer? Then lyxlex have one item in its
Lars> keyword table, and you still have to "manually" parse the rest
Lars> of the line.

Except that the 'manual' part certainly does not use more lines of code than the smart regexp code.

Lars> I really think that to avoid the escaping is nice.

Well, you can have that with LyXLex if you do not quote the second argument.
Re: Strange "feature"
Jean-Marc Lasgouttes [EMAIL PROTECTED] writes:

| "Jean-Marc" == Jean-Marc Lasgouttes [EMAIL PROTECTED] writes:
|
| Jean-Marc> After debugging this a bit, it seems that you broke the
| Jean-Marc> parsing of cdef files with the new regexp stuff. The
| Jean-Marc> problem is that you do not handle backslash escapes, and
| Jean-Marc> \\\"{o} is kept as is in chset map, instead of transforming
| Jean-Marc> it to \"{o}.
|
| Lars, I see that you decided that escaping was bad and changed
| iso8859-1.cdef to reflect what you think is good. However, I see
| several problems:

The escaping was not just bad, it was plain wrong. Run lyx with -dbg key,keymap and you can see for yourself. Why demand escaping when it is not needed?

| - the concerns I have with the non-robustness of your regexp-based
|   parser remain. I understand that regexps are fun, but...

If the line matches, it matches; otherwise it does not. As for robustness, it does not crash.

| - the other cdef files have not been modified, and thus the problem
|   remains for them

Note that the actual contents of the .cdef files have not been used until I made my changes yesterday.

| - the current syntax does not make sense, since we have non-escaped "
|   characters inside "" groups.

So? That matches ' "[^ ]" '

| - If you do not like those escapes in the old files, we can in fact
|   just forget about the quotes and it should remove the need for
|   quoting (does it? if not this is a bug in lyxlex).

Yes, that was one of my comments: why use two other chars as delimiters?

| Why not just revert to the old parser?

Wrong use of lyxlex. To use lyxlex as a glorified tokenizer is not right.

Lgb
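The robustness disagreement in this exchange comes down to what happens to a line that does not match. The following is only a sketch of one middle ground, not the LyX code: a regexp-based reader that skips blank lines and comments but collects every other non-matching line as an error instead of dropping it. The `#` comment prefix, the function name `read_cdef` and the sample lines are assumptions made for illustration. The pattern is the stricter variant proposed elsewhere in this thread.

```python
import re

# Stricter entry pattern from the thread, C-string escaping removed.
ENTRY = re.compile(r'^[ \t]*([12][0-9][0-9])[ \t]+"(.*)"[ \t]*$')


def read_cdef(lines, comment_prefix="#"):
    """Parse cdef-style lines.

    Returns (entries, errors): `entries` maps the numeric code to the TeX
    string, `errors` lists (line_number, text) for malformed lines.
    The comment prefix is an assumption; the thread does not specify it.
    """
    entries = {}
    errors = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        stripped = line.strip()
        if not stripped or stripped.startswith(comment_prefix):
            continue  # blank line or comment: ignored on purpose
        m = ENTRY.match(line)
        if m is None:
            errors.append((lineno, line))  # malformed: reported, not hidden
            continue
        entries[int(m.group(1))] = m.group(2)
    return entries, errors


# Hypothetical file contents.
sample = ['# comment', '', '246 "\\"{o}"', '246 oops']
entries, errors = read_cdef(sample)
print(entries)
print(errors)
```

Here the fourth line ends up in `errors` with its line number, so a typo in a cdef file becomes visible without the parser having to crash.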
Re: Strange "feature"
"Lars" == Lars Gullik Bjønnes [EMAIL PROTECTED] writes:

Lars> Is this also true after reading the file again? What keymap are
Lars> you using?

Lars> Seems like LyX does not think that this char can be directly
Lars> shown/displayed, and used an InsetLatexInset to show it.

After debugging this a bit, it seems that you broke the parsing of cdef files with the new regexp stuff. The problem is that you do not handle backslash escapes, and \\\"{o} is kept as is in the chset map, instead of being transformed to \"{o}.

It does not seem to me that the new parsing is significantly better than plain old lyxlex... Moreover, your regexp

"^([12][0-9][0-9])[ \t]+\"([^ ]+)\".*"

does not allow for spaces at the beginning of a line, but allows any junk at the end of a line... And any wrongly formed line is silently ignored as if it were a comment.

Frankly, I think the old code was much better (at least for _this_ particular file: there might be advantages to regexps in other cases). If you really like regexps, you should design a regexp-based parse class, which handles all the robustness matters.

JMarc
Re: Strange "feature"
Lars> Is this also true after reading the file again? What keymap are

This is about the third or fourth mail from Lars that gets cited on the list although I did not receive the original. Since the question looks like Lars asked me (it was my problem), I should have received at least one copy, shouldn't I?

Any idea? I am pretty sure Lars is not in any kind of killfile here...

Andre'
Re: Strange "feature"
"Andre'" == Andre' Poenitz [EMAIL PROTECTED] writes:

Lars> Is this also true after reading the file again? What keymap are

Andre'> This is about the third or fourth mail from Lars that gets
Andre'> cited in the list although I did not receive the original.
Andre'> Since the question looks like Lars asked me (it was my
Andre'> problem) I should have received at least one copy, shouldn't
Andre'> I?

Andre'> Any idea? I am pretty sure Lars is not in any kind of killfile
Andre'> here...

That's strange... The message is correctly archived at
http://www.mail-archive.com/lyx-devel@lists.lyx.org/msg08012.html

I append the header of the message to this message. The only notable thing is that it has been sent to lyx-devel but not CC'd to you (which is rather a good idea).

JMarc

Mail-from: From [EMAIL PROTECTED] Wed Dec 15 02:04:03 1999
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78])
    by fantomas.inria.fr (8.8.5/8.8.7) with ESMTP id CAA09638
    for [EMAIL PROTECTED]; Wed, 15 Dec 1999 02:04:03 +0100 (MET)
Received: from wierdlmpc.msci.memphis.edu (wierdlmpc.msci.memphis.edu [141.225.11.87])
    by nez-perce.inria.fr (8.8.7/8.8.7) with SMTP id CAA08816
    for [EMAIL PROTECTED]; Wed, 15 Dec 1999 02:04:02 +0100 (MET)
Received: (qmail 14883 invoked by uid 514); 15 Dec 1999 01:04:12 -
Mailing-List: contact [EMAIL PROTECTED]; run by ezmlm
Precedence: bulk
X-No-Archive: yes
List-Unsubscribe: mailto:lyx-devel-unsubscribe-Jean-Marc.Lasgouttes=inria.fr@lists.lyx.org
Delivered-To: mailing list [EMAIL PROTECTED]
Received: (qmail 14873 invoked from network); 15 Dec 1999 01:04:11 -
To: [EMAIL PROTECTED]
Subject: Re: Strange "feature"
References: [EMAIL PROTECTED]
From: [EMAIL PROTECTED] (Lars Gullik Bjønnes)
Date: 15 Dec 1999 02:03:46 +0100
In-Reply-To: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
X-Mailer: Gnus v5.5/Emacs 20.3
Lines: 20
Xref: fantomas.inria.fr lyx-devel:1968
X-Gnus-Newsgroup: lyx-devel:1968 Wed Dec 15 10:31:15 1999
Re: Strange "feature"
"Andre' Poenitz" [EMAIL PROTECTED] writes:

| Lars> Is this also true after reading the file again? What keymap are
|
| This is about the third or fourth mail from Lars that gets cited in the
| list although I did not receive the original. Since the question looks
| like Lars asked me (it was my problem) I should have received at least
| one copy, shouldn't I?
|
| Any idea? I am pretty sure Lars is not in any kind of killfile here...

Please forward this to Andre'...

The mail server at your site does not allow 8-bit characters in mail headers. Although this is strictly correct with regard to the RFC (822?), it is not good in the real world, and there is no really good reason to disallow 8-bit characters in mail headers. Pester your mail admins to upgrade the mail server. (My name is my name is my name.)

Lgb
Re: Strange "feature"
"Andre' Poenitz" [EMAIL PROTECTED] writes:

| 1.1.4cvs, option->keyboard german. Encoding latin1.
|
| When pressing ö (this funny German letter with two dots above an o)
| I get something that looks like two dots above an o.
|
| Well, no big deal you might think, that's what it is supposed to do.
|
| But: The dots are far higher than 'normal' and it does not look like a
| single character at all. Indeed, when looking at the .lyx file there
| is a \i \"{o} instead of ö in the file.

Is this also true after reading the file again? What keymap are you using?

Seems like LyX does not think that this char can be directly shown/displayed, and used an InsetLatexInset to show it.

Lgb