Re: Strange "feature"

1999-12-21 Thread Lars Gullik Bjønnes

Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes:

| > "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes:
| 
| Lars> The escaping was not just bad, it was plain wrong. run lyx with
| Lars> -dbg key,keymap and you can see yourself. Why demand escaping
| Lars> when it is not needed.
| 
| The escaping was right with LyXLex, AFAIK.

With lyxlex it didn't matter since the result was not used anyway.

| Sure, if I add a space at the beginning of the line for some reason,
| it does not get parsed. On the other hand, I can add any junk at the
| end of the line, and it will be ignored. Talk about robustness... I
| know the regexp can be made better,

Sure it can.
"^[ \t]*([12][0-9][0-9])[ \t]+\"([^ ]+)\"[ \t]+.*"


| 
| Lars> | - the other cdef files have not been modified, and thus the
| Lars> problem | remain for them
| 
| Lars> Note that the actual contents of the .cdef files have not been
| Lars> used until I made my changes yesterday.
| 
| Yes, but they were used before you ditched lyxlex, I believe.

No they were not. At least not in 1.0.4, just look at
 CharacterSet::encodeString in chset.C
it returns true if we have a str matching one of the strings in the
cmap, the number is never used.

| And maybe is it just that I do not know regexp well enough, but how
| are you sure that the match for a string like "\"{e}" will not be just
| "\"? Are _all_ regexp implementations required to do this? 

Yes, a regex will always try to match as much as possible.
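
To illustrate the greedy-matching point on a sample line, here is a small Python sketch. It is not LyX code: the sample line and the variable names are assumptions made for illustration, and the behaviour shown is that of Python's `re` module only.

```python
import re

# Hypothetical cdef-style line whose quoted string itself contains a quote.
line = r'233 "\"{e}"'

# Greedy: [^ ]+ grabs as much as it can, then backs off just far enough
# for the closing quote to match.
greedy = re.search(r'"([^ ]+)"', line).group(1)

# Non-greedy, for contrast: stops at the first quote it can reach.
lazy = re.search(r'"([^ ]+?)"', line).group(1)

print(greedy)  # \"{e}
print(lazy)    # \
```

With the greedy quantifier the captured string is the whole `\"{e}`; only the explicitly non-greedy form stops at the inner quote.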

| 
| Lars> | - If you do not like those escape in the old files, we can in
| Lars> fact | just forget about the quotes and it should remove the
| Lars> need for | quoting (does it? if not this is a bug in lyxlex).
| 
| Lars> Yes that was one of my comments, why use two other chars as
| Lars> delimiters.
| 
| What I definitely do not like about your solution is that cdef files do
| not have a syntax anymore (unless you count `anything that matches my
| regexp' as a syntax). In fact, the only way to document the syntax is
| probably to give the regexp...

Syntax:

nnn "<tex-char>"

I agree that "" looks a bit weird when you allow " inside too.

\def{<tex-char>}{nnn}

could be an alternative

| Lars> Wrong use of lyxlex. To use lyxlex as a glorified tokenizer is
| Lars> not right.
| 
| Why? A tokenizer is just what is needed here. If you want nice tokens,
| just modify the syntax to be
|   Chardef 192 "\\`{A}"
| and you have nice tokens to play with...

How will that be any nicer? Then lyxlex has one item in its keyword
table, and you still have to "manually" parse the rest of the line.
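
A rough Python sketch of the tokenizer variant discussed above. `shlex` merely stands in for a quote-aware tokenizer and says nothing about how lyxlex actually works; `parse_chardef` is a hypothetical name.

```python
import shlex

def parse_chardef(line):
    """Split a 'Chardef nnn "string"' line into (number, string).

    shlex is only a stand-in for a quote-aware tokenizer; inside double
    quotes it turns \\\\ into a single backslash and \\" into a quote.
    """
    tokens = shlex.split(line)
    # The remaining checks are the "manual" part: keyword, arity, number.
    if len(tokens) != 3 or tokens[0] != "Chardef":
        raise ValueError("not a Chardef line: %r" % line)
    return int(tokens[1]), tokens[2]

number, tex = parse_chardef(r'Chardef 192 "\\`{A}"')
print(number, tex)
```

The tokenizer handles quoting and escaping, while the keyword check and the number conversion still have to be written by hand.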

I really think that avoiding the escaping is nice.

Lgb



Re: Strange "feature"

1999-12-21 Thread Jean-Marc Lasgouttes

> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes:

Lars> Sure it can. "^[ \t]*([12][0-9][0-9])[ \t]+\"([^ ]+)\"[ \t]+.*"

Wouldn't something like
"^[ \t]*([12][0-9][0-9])[ \t]+\"(.*)\"[ \t]*$"
be better? It avoids junk at the end, does not force a space after the
final ".
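
The difference between the two patterns can be checked on sample lines. The following Python sketch is only an illustration (LyX itself is not written in Python); the names `LARS`, `JMARC` and `parse`, and the sample lines, are assumptions.

```python
import re

# The two patterns from this thread, written as Python raw strings.
LARS = r'^[ \t]*([12][0-9][0-9])[ \t]+"([^ ]+)"[ \t]+.*'
JMARC = r'^[ \t]*([12][0-9][0-9])[ \t]+"(.*)"[ \t]*$'

def parse(pattern, line):
    """Return (number, string) if the line matches, else None."""
    m = re.match(pattern, line)
    return (m.group(1), m.group(2)) if m else None

tight = r'  233 "\"{e}"'     # nothing after the closing quote
junk = r'233 "\"{e}" junk'   # trailing garbage

for name, pattern in (("LARS", LARS), ("JMARC", JMARC)):
    print(name, parse(pattern, tight), parse(pattern, junk))
```

On these inputs the first pattern rejects the line with nothing after the closing quote and accepts the one with trailing junk, while the second pattern does the opposite.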

Lars> No they were not. At least not in 1.0.4, just look at
Lars> CharacterSet::encodeString in chset.C it returns true if we have
Lars> a str matching one of the strings in the cmap, the number is
Lars> never used.

Yes, I saw that, but I thought that it was used somewhere else that I
did not see. What is the theory of how this stuff is used, anyway?

Lars> \def{<tex-char>}{nnn}

Lars> could be an alternative

Since nnn almost surely contains braces (\'{e}), this would not be
better. Moreover, this would use a false tex syntax that does not make
sense in TeX. We might want to use the syntax of LaTeX inputenc files,
though, and directly parse that...

Lars> How will that be any nicer? Then lyxlex have one item in its
Lars> keyword table, and you still have to "manually" parse the rest
Lars> of the line.

Except that the 'manually' part does certainly not use more lines of
code than the smart regexp code.

Lars> I really think that to avoid the escaping is nice.

Well, you can have that with LyXLex if you do not quote the second
argument. 



Re: Strange "feature"

1999-12-17 Thread Lars Gullik Bjønnes

Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes:

| > "Jean-Marc" == Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes:
| 
| Jean-Marc> After debugging this a bit, it seems that you broke the
| Jean-Marc> parsing of cdef files with the new regexp stuff. The
| Jean-Marc> problem is that you do not handle backslash escapes, and
| Jean-Marc> \\\"{o} is kept as is in chset map, instead of transforming
| Jean-Marc> it to \"{o}.
| 
| Lars, I see that you decided that escaping was bad and changed
| iso8859-1.cdef to reflect what you think is good. However, I see
| several problems:

The escaping was not just bad, it was plain wrong.
Run lyx with -dbg key,keymap and you can see for yourself.
Why demand escaping when it is not needed?

| - the concerns I have with the non robustness of your regexp-based
|   parser remain. I understand that regexps are fun, but...

If the line matches, it matches; otherwise it does not. As for robustness,
it does not crash.

| - the other cdef files have not been modified, and thus the problem
|   remain for them

Note that the actual contents of the .cdef
files have not been used until I made my changes yesterday.

| - the current syntax does not make sense, since we have non-escaped "
|   characters inside "" groups.

So?
that matches ' "[^ ]" '

| - If you do not like those escape in the old files, we can in fact
|   just forget about the quotes and it should remove the need for
|   quoting (does it? if not this is a bug in lyxlex).

Yes that was one of my comments, why use two other chars as
delimiters.


| Why not just revert to the old parser?

Wrong use of lyxlex.
To use lyxlex as a glorified tokenizer is not right.

Lgb



Re: Strange "feature"

1999-12-15 Thread Jean-Marc Lasgouttes

> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes:

Lars> Is this also true after reading the file again? What keymap are
Lars> you using?

Lars> Seems like LyX does not think that this char can be directly
Lars> shown/displayed, and used an InsetLatexInset to show it.

After debugging this a bit, it seems that you broke the parsing of
cdef files with the new regexp stuff. The problem is that you do not
handle backslash escapes, and \\\"{o} is kept as is in chset map,
instead of transforming it to \"{o}.
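
As an illustration of the missing step, here is a minimal Python sketch of backslash unescaping. It assumes the old rule was simply "a backslash makes the next character literal"; that rule and the name `unescape` are assumptions, not taken from the lyxlex source.

```python
import re

def unescape(text):
    """Drop one level of backslash escaping: backslash + X becomes X."""
    return re.sub(r'\\(.)', r'\1', text)

escaped = r'\\\"{o}'   # as written in the old cdef file
print(unescape(escaped))
```

Applied to the escaped form from the old file, this yields the plain LaTeX string `\"{o}`.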

It does not seem to me that the new parsing is significantly better
than plain old lyxlex...

Moreover, your regexp "^([12][0-9][0-9])[ \t]+\"([^ ]+)\".*" does not
allow for spaces at the beginning of a line, but allows any junk at
the end of a line... And any wrongly formed line is silently ignored
as if it were a comment. Frankly, I think the old code was much better
(at least for _this_ particular file: there might be advantages to
regexps in other cases). If you really like regexps, you should design
a regexp-based parse class, which handles all the robustness matters.
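
One possible shape for such a parser, sketched in Python rather than in LyX's own code. The pattern, the `#` comment convention, the string-to-number table layout and the name `parse_cdef` are all assumptions made for illustration; the point is only that malformed lines are collected and reported instead of being skipped silently.

```python
import re

# Assumed entry syntax: optional leading blanks, a three-digit code,
# a quoted string, optional trailing blanks, and nothing else.
ENTRY = re.compile(r'^[ \t]*([12][0-9][0-9])[ \t]+"(.*)"[ \t]*$')

def parse_cdef(lines):
    """Return (table, errors) for an iterable of cdef-style lines."""
    table = {}
    errors = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        stripped = line.strip()
        # Assumption: blank lines and '#' lines are comments.
        if not stripped or stripped.startswith("#"):
            continue
        m = ENTRY.match(line)
        if m is None:
            errors.append((lineno, line))   # report, do not ignore
            continue
        table[m.group(2)] = int(m.group(1))
    return table, errors

sample = ["# comment", "", r'  246 "\"{o}"', r'246 "\"{o}" junk', "garbage"]
table, errors = parse_cdef(sample)
print(table)
print(errors)
```

Here the indented entry is accepted, while the line with trailing junk and the line that is not an entry at all both end up in `errors` with their line numbers.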

JMarc



Re: Strange "feature"

1999-12-15 Thread Andre' Poenitz

> Lars> Is this also true after reading the file again? What keymap are
> 

This is about the third or fourth mail from Lars that gets cited in the
list although I did not receive the original. Since the question looks
like Lars asked me (it was my problem) I should have received at least
one copy, shouldn't I?

Any idea? I am pretty sure Lars is not in any kind of killfile here...

Andre'



Re: Strange "feature"

1999-12-15 Thread Jean-Marc Lasgouttes

>>>>> "Andre'" == Andre' Poenitz <[EMAIL PROTECTED]> writes:

Lars> Is this also true after reading the file again? What keymap are
>>

Andre'> This is about the third or fourth mail from Lars that gets
Andre'> cited in the list although I did not receive the original.
Andre'> Since the question looks like Lars asked me (it was my
Andre'> problem) I should have received at least one copy, shouldn't
Andre'> I?

Andre'> Any idea? I am pretty sure Lars is not in any kind of killfile
Andre'> here...

That's strange... The message is correctly archived at
http://www.mail-archive.com/lyx-devel@lists.lyx.org/msg08012.html

I append the header of the message to this message. The only notable
thing is that it has been sent to lyx-devel but not CC'd to you (which
is rather a good idea).

JMarc

Mail-from: From [EMAIL PROTECTED]
 Wed Dec 15 02:04:03 1999
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78])
by fantomas.inria.fr (8.8.5/8.8.7) with ESMTP id CAA09638
for <[EMAIL PROTECTED]>; Wed, 15 Dec 1999 02:04:03 +0100 (MET)
Received: from wierdlmpc.msci.memphis.edu (wierdlmpc.msci.memphis.edu [141.225.11.87])
by nez-perce.inria.fr (8.8.7/8.8.7) with SMTP id CAA08816
for <[EMAIL PROTECTED]>; Wed, 15 Dec 1999 02:04:02 +0100 (MET)
Received: (qmail 14883 invoked by uid 514); 15 Dec 1999 01:04:12 -
Mailing-List: contact [EMAIL PROTECTED]; run by ezmlm
Precedence: bulk
X-No-Archive: yes
List-Unsubscribe: <mailto:lyx-devel-unsubscribe-Jean-Marc.Lasgouttes=inria.fr@lists.lyx.org>
Delivered-To: mailing list [EMAIL PROTECTED]
Received: (qmail 14873 invoked from network); 15 Dec 1999 01:04:11 -
To: [EMAIL PROTECTED]
Subject: Re: Strange "feature"
References: <[EMAIL PROTECTED]>
From: [EMAIL PROTECTED] (Lars Gullik Bjønnes)
Date: 15 Dec 1999 02:03:46 +0100
In-Reply-To: <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>
X-Mailer: Gnus v5.5/Emacs 20.3
Lines: 20
Xref: fantomas.inria.fr lyx-devel:1968
X-Gnus-Newsgroup: lyx-devel:1968   Wed Dec 15 10:31:15 1999



Re: Strange "feature"

1999-12-15 Thread Lars Gullik Bjønnes

"Andre' Poenitz" <[EMAIL PROTECTED]> writes:

| > Lars> Is this also true after reading the file again? What keymap are
| > 
| 
| This is about the third or fourth mail from Lars that gets cited in the
| list although I did not receive the original. Since the question looks
| like Lars asked me (it was my problem) I should have received at least
| one copy, shouldn't I?
| 
| Any idea? I am pretty sure Lars is not in any kind of killfile here...

Please forward this to Andre'...

The mailserver at your site does not allow 8bit characters in mail
headers. Although this is strictly correct with regard to the RFC
(822?), it is not good in the real world, and there is no really good
reason to disallow 8bit characters in mail headers. Pester your
mailadmins to upgrade the mailserver.

(my name is my name is my name)

Lgb



Re: Strange "feature"

1999-12-14 Thread Lars Gullik Bjønnes

"Andre' Poenitz" <[EMAIL PROTECTED]> writes:

| 1.1.4cvs, option->keyboard german. Encoding latin1.
| 
| When pressing ö (this funny german letter with two dots above an o)
| I get something that looks like two dots above an o.
| 
| Well, no big deal you might think, that's what it is supposed to do.
| 
| But: The dots are far higher than 'normal' and it does not look like a
| single character at all. Indeed, when looking at the .lyx file there
| is a   \i \"{o}  instead of ö in the file.

Is this also true after reading the file again?
What keymap are you using?

Seems like LyX does not think that this char can be directly
shown/displayed, and used an InsetLatexInset to show it.

Lgb