Re: how to solve this reduce/reduce conflict?
To interject. The 'modern' notion of spaces is in accord with how we now understand parser generators to work, and most particularly how BNF works. In times long past this was not well understood. In Fortran II, IV and 77, spaces had no meaning. This meant that all of the following held:

    xy and x y were the same
    do 100 i = 1, 2, 3 and do1 00 i= 1 ,2 , 3 were the same

We now have a better understanding of BNF and compiling. The issue should be one of support. Towards this end let me make the following observations:

1. Spaces should be used as delimiters because it makes parsing easier.
2. If spaces are to have a special place then, if possible, the treatment of spaces should be handled in flex.
3. If flex does not work, then the parser must be used. As in (Fortran II, IV, 77)

    doloop : "do" number id "=" number "," number "," number
           | id "=" number "," number "," number

and I realize this is not complete, this is only an example. The thing that distinguishes an assignment (id "=" number) from a doloop is the ",".

Based on your example I assume that having flex change "1 23" into "123" would be a better approach than using the parser. In this case it might be better to substitute something else for a space. If I remember correctly, Ada allows 1_234_567 whereas you would (perhaps) write this as "1 234 567".

As to the reason that spaces are used in your application I think that this will make life difficult for you, and I personally would not do it. In similar manner some of your comments do not accord well with current practice, such as the separate meanings given to "- 123" and "-123". I think you will face increasing difficulty in implementing some of your ideas and I would encourage you to redirect your efforts to more conventional approaches.

The example given on typesetters has little relevance. Typesetters deal with the visual appearance of text. Parsers and lexers do not, and they have well thought out means of describing things.
But, it's your code and your issue and I don't care.

art

On 9/23/2022 4:52 AM, AW wrote:
> On 22/09/2022 21:34, Derek Clegg wrote:
>> This is horrid, and not how math works. Spaces necessarily mean nothing,
>> and imbuing them with meaning is nonsense.
>
> if u want to say one-hundred-twenty-three u would not think, that C understands "1 23"...
> I feel like "-123" and "- 123" are not quite the same... That feels more natural to me...
> But I noticed that it was quite tricky to teach flex&bison the difference between sign and unary-inversion-operator... 😋
>
> On 2022-09-23T12:47:23CEST Evan Lavelle wrote:
>> It's a programming language, not maths. There are, of course, languages
>> in which spaces necessarily mean something. But I can't bring myself to
>> use any of them.
>
> Maybe this example demonstrates a use: "123" is usually not the same as "12 3"...
> Maybe something like "1 234 567" might be used in newspapers for "1234567", but the professional typesetters use special space symbols then, which are thinner and they stick hard to their neighbors, so that a newline is impossible at them... 😋
>
> -arne
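As a rough illustration of art's point that an explicit separator is easier on the lexer than a space, the sketch below strips Ada-style `_` separators from an integer literal before conversion. The function name `strip_digit_separators` and its rejection rules (no leading, trailing, or doubled `_`) are assumptions made for this example; they come neither from the thread nor from flex.

```cpp
#include <cassert>
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: remove '_' digit separators from an integer literal,
// Ada style ("1_234_567" -> "1234567"). Returns std::nullopt for malformed
// input: empty text, a leading/trailing/doubled '_', or any other character
// (including a space).
std::optional<std::string> strip_digit_separators(std::string_view literal) {
    std::string digits;
    bool prev_was_digit = false;
    for (char c : literal) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            digits.push_back(c);
            prev_was_digit = true;
        } else if (c == '_' && prev_was_digit) {
            prev_was_digit = false;  // a separator must be followed by a digit
        } else {
            return std::nullopt;
        }
    }
    if (!prev_was_digit) return std::nullopt;  // empty input or trailing '_'
    assert(!digits.empty());                   // at least one digit was seen
    return digits;
}
```

In a scanner, a helper like this would run inside the action for the number pattern, so the parser only ever sees a plain number token.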
Re: how to solve this reduce/reduce conflict?
On 22/09/2022 21:34, Derek Clegg wrote:
> This is horrid, and not how math works. Spaces necessarily mean nothing,
> and imbuing them with meaning is nonsense.

if u want to say one-hundred-twenty-three u would not think, that C understands "1 23"...
I feel like "-123" and "- 123" are not quite the same... That feels more natural to me...
But I noticed that it was quite tricky to teach flex&bison the difference between sign and unary-inversion-operator... 😋

On 2022-09-23T12:47:23CEST Evan Lavelle wrote:
> It's a programming language, not maths. There are, of course, languages
> in which spaces necessarily mean something. But I can't bring myself to
> use any of them.

Maybe this example demonstrates a use: "123" is usually not the same as "12 3"...
Maybe something like "1 234 567" might be used in newspapers for "1234567", but the professional typesetters use special space symbols then, which are thinner and they stick hard to their neighbors, so that a newline is impossible at them... 😋

-arne
Re: how to solve this reduce/reduce conflict?
On 22/09/2022 21:34, Derek Clegg wrote:
> This is horrid, and not how math works. Spaces necessarily mean nothing,
> and imbuing them with meaning is nonsense. Please reconsider your grammar.

It's a programming language, not maths. There are, of course, languages in which spaces necessarily mean something. But I can't bring myself to use any of them. I don't like it, but to each their own.
Re: how to solve this reduce/reduce conflict?
This is horrid, and not how math works. Spaces necessarily mean nothing, and imbuing them with meaning is nonsense. Please reconsider your grammar.

> On Sep 22, 2022, at 8:28 PM, Lukas Arsalan wrote:
>
> On 2022-09-22T15:54:31UTC Hans Åberg wrote:
>> Context switches are best avoided unless absolutely necessary, in my
>> experience.
>> So if one designs ones own language, it might be good to try to avoid them
>> by a change in the grammar.
>>
> OK... I know that there are no signed numbers usually... But I wanted to try
> to change that...
> So for _me_ in "-2" the minus is a sign... And in "- 2" the minus is a unary
> inversion operator... And in "1-2" the minus is a subtraction operator (or
> an abbreviation for "1+-2" respectively (where the minus is a sign again))...
> This can all be done quite elegantly with this context trick in the ll-file...
>
>> It might be confusing with -2^4 meaning (-2)^4, because in 1 - 2^4, it
>> should be 1 - (2^4),
>> and 1 -2^4 would be an error if two number cannot follow each other.
>>
> "1 -2^4" is no error in my program... it results in "-15".
> It even says, that "- 2^4" is "-16", while "-2^4" is "16". 🥳
>
> Do u think there will be any unwanted side effects?
>
> -arne
Re: how to solve this reduce/reduce conflict?
On 2022-09-22T15:54:31UTC Hans Åberg wrote:
> Context switches are best avoided unless absolutely necessary, in my
> experience.
> So if one designs ones own language, it might be good to try to avoid them
> by a change in the grammar.

OK... I know that there are no signed numbers usually... But I wanted to try to change that...
So for _me_ in "-2" the minus is a sign... And in "- 2" the minus is a unary inversion operator... And in "1-2" the minus is a subtraction operator (or an abbreviation for "1+-2" respectively (where the minus is a sign again))...
This can all be done quite elegantly with this context trick in the ll-file...

> It might be confusing with -2^4 meaning (-2)^4, because in 1 - 2^4, it should
> be 1 - (2^4),
> and 1 -2^4 would be an error if two number cannot follow each other.

"1 -2^4" is no error in my program... it results in "-15".
It even says, that "- 2^4" is "-16", while "-2^4" is "16". 🥳

Do u think there will be any unwanted side effects?

-arne
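The three readings of the minus sign described in this message can be reproduced in a few lines. The following is a minimal sketch using a hand-written recursive-descent evaluator, not the flex/bison program from the thread. `Evaluator` and `evaluate` are hypothetical names, and only numbers, `+`, `-`, `^`, parentheses and spaces are supported (a leading `+` sign is left out for brevity). It assumes that `^` binds tighter than the unary inversion operator and is right associative, which is consistent with the results quoted in the message.

```cpp
#include <cassert>
#include <cctype>
#include <cmath>
#include <stdexcept>
#include <string>
#include <utility>

// Sketch only: recursive descent stands in for the flex/bison setup.
class Evaluator {
public:
    explicit Evaluator(std::string text) : s_(std::move(text)) {}

    double run() {
        double v = expr();
        skip_ws();
        if (pos_ != s_.size()) throw std::runtime_error("trailing input");
        return v;
    }

private:
    std::string s_;
    std::size_t pos_ = 0;

    void skip_ws() { while (pos_ < s_.size() && s_[pos_] == ' ') ++pos_; }
    bool peek(char c) { skip_ws(); return pos_ < s_.size() && s_[pos_] == c; }
    bool digit_at(std::size_t i) const {
        return i < s_.size() && std::isdigit(static_cast<unsigned char>(s_[i]));
    }

    // expr := unary (('+' | '-') unary)*    here '-' is subtraction
    double expr() {
        double v = unary();
        for (;;) {
            if (peek('+')) { ++pos_; v += unary(); }
            else if (peek('-')) { ++pos_; v -= unary(); }
            else return v;
        }
    }

    // unary := '-' unary | power    '-' NOT directly followed by a digit
    double unary() {
        if (peek('-') && !digit_at(pos_ + 1)) { ++pos_; return -unary(); }
        return power();
    }

    // power := primary ('^' unary)?    right associative
    double power() {
        double base = primary();
        if (peek('^')) { ++pos_; return std::pow(base, unary()); }
        return base;
    }

    // primary := '(' expr ')' | ['-'] digits ['.' digits]    '-' is a sign
    double primary() {
        if (peek('(')) {
            ++pos_;
            double v = expr();
            if (!peek(')')) throw std::runtime_error("expected ')'");
            ++pos_;
            return v;
        }
        std::size_t start = pos_;
        if (peek('-')) ++pos_;
        if (!digit_at(pos_)) throw std::runtime_error("expected number");
        while (digit_at(pos_)) ++pos_;
        if (pos_ < s_.size() && s_[pos_] == '.' && digit_at(pos_ + 1)) {
            ++pos_;
            while (digit_at(pos_)) ++pos_;
        }
        assert(pos_ > start);  // at least one digit was consumed
        return std::stod(s_.substr(start, pos_ - start));
    }
};

double evaluate(const std::string& text) { return Evaluator(text).run(); }
```

With this sketch, `evaluate("-2^4")` gives 16, `evaluate("- 2^4")` gives -16 and `evaluate("1 -2^4")` gives -15, matching the values reported in the message. Because the parser always knows whether it expects an operand or an operator, no scanner state is needed here; that is a design difference from the start-condition approach, not a claim about which is better.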
Re: how to solve this reduce/reduce conflict?
> On 22 Sep 2022, at 21:02, Lukas Arsalan wrote:
>
> On 2022-09-22T15:54:31UTC Hans Åberg wrote:
>> Context switches are best avoided unless absolutely necessary, in my
>> experience.
>> So if one designs ones own language, it might be good to try to avoid them
>> by a change in the grammar.
>>
> OK... I know that there are no signed numbers usually... But I wanted to try
> to change that...
> So for _me_ in "-2" the minus is a sign... And in "- 2" the minus is a unary
> inversion operator... And in "1-2" the minus is a subtraction operator (or
> an abbreviation for "1+-2" respectively (where the minus is a sign again))...
> This can all be done quite elegantly with this context trick in the ll-file...

I think the C/C++ interpretation, with a unary operator and no signed integers, is the best one for arithmetic expressions. Having a sign as a part of the number may be suitable in other contexts.

>> It might be confusing with -2^4 meaning (-2)^4, because in 1 - 2^4, it
>> should be 1 - (2^4),
>> and 1 -2^4 would be an error if two number cannot follow each other.
>>
> "1 -2^4" is no error in my program... it results in "-15".
> It even says, that "- 2^4" is "-16", while "-2^4" is "16". 🥳
>
> Do u think there will be any unwanted side effects?

In the minds of those interpreting it. :-)
Re: how to solve this reduce/reduce conflict?
On 2022-09-22T07:57:45UTC Hans Åberg wrote:
> On 22 Sep 2022, at 08:30, Lukas Arsalan wrote:
>> [1] -1 --> "num"
>> [2] 1-2 --> "num" "-" "num"
>> [3] (-1^-2) --> "(" "num" "^" "num" ")"
>> [4] 1--2 --> "num" "-" "num"
>> [5] 1---3 --> "num" "-" "-" "num"
>> [6] 1-2^3 --> "num" "-" "num" "^" "num"
>> I do not think that it is possible, to do that with regular expressions...
>>
> I think it is not possible, so therefore one expects -2⁴ to be parsed as
> -(2⁴).

I found that `%s nosinum` for the ll-file...
Now I can do things like this:

    "+"           BEGIN(INITIAL); return yy::parser::make_ADD(loc);
    "("           BEGIN(INITIAL); return yy::parser::make_BROP(loc);
    ")"           BEGIN(nosinum); return yy::parser::make_BRCL(loc);
    {bint}        BEGIN(nosinum); return make_INT(yytext,loc);
    {float}       BEGIN(nosinum); return make_FLOAT(yytext,loc);
    [+-]?{bint}   BEGIN(nosinum); return make_INT(yytext,loc);
    [+-]?{float}  BEGIN(nosinum); return make_FLOAT(yytext,loc);

and i removed the SNUM token...

now it seems to work just right.. 🥳

it even handles the whitespaces to my liking... 😋

but i do not know what kind of formal language that is now...

-arne
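For readers who want to experiment without flex, the state switch can be imitated by hand. This is a minimal sketch, assuming that the signed patterns are meant to match only in the `INITIAL` state and that `-` and `^` switch back to `INITIAL` the same way `+` does (the rules for those two tokens are not shown in the message). `tokenize` and the `after_value` flag are hypothetical names; the flag stands in for the `nosinum` start condition, and only integers are recognised to keep the example short.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Hand-written stand-in for the start-condition idea. `after_value` is true
// after a number or ")" (the `nosinum` state) and false after an operator
// or "(" (the INITIAL state). Whitespace is skipped and keeps the state.
std::vector<std::string> tokenize(const std::string& s) {
    std::vector<std::string> tokens;
    bool after_value = false;
    std::size_t i = 0;
    auto digit_at = [&](std::size_t k) {
        return k < s.size() && std::isdigit(static_cast<unsigned char>(s[k]));
    };
    while (i < s.size()) {
        char c = s[i];
        if (c == ' ') { ++i; continue; }
        // A sign is only accepted where no value precedes it and a digit
        // follows it directly.
        bool sign = (c == '+' || c == '-') && !after_value && digit_at(i + 1);
        if (digit_at(i) || sign) {
            std::size_t start = i;
            if (sign) ++i;
            while (digit_at(i)) ++i;
            assert(i > start);  // the number branch always consumes input
            tokens.push_back("num");
            after_value = true;
        } else if (c == '+' || c == '-' || c == '^' || c == '(') {
            tokens.push_back(std::string(1, c));
            after_value = false;
            ++i;
        } else if (c == ')') {
            tokens.push_back(")");
            after_value = true;
            ++i;
        } else {
            throw std::runtime_error("unexpected character");
        }
    }
    return tokens;
}
```

Run on the six inputs quoted at the top of the message, this sketch yields the token sequences listed there, for example `num - - num` for `1---3`.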
Re: how to solve this reduce/reduce conflict?
> On 22 Sep 2022, at 16:52, Lukas Arsalan wrote:
>
> On 2022-09-22T07:57:45UTC Hans Åberg wrote:
>> On 22 Sep 2022, at 08:30, Lukas Arsalan wrote:
>>> [1] -1 --> "num"
>>> [2] 1-2 --> "num" "-" "num"
>>> [3] (-1^-2) --> "(" "num" "^" "num" ")"
>>> [4] 1--2 --> "num" "-" "num"
>>> [5] 1---3 --> "num" "-" "-" "num"
>>> [6] 1-2^3 --> "num" "-" "num" "^" "num"
>>> I do not think that it is possible, to do that with regular expressions...
>>>
>> I think it is not possible, so therefore one expects -2⁴ to be parsed as
>> -(2⁴).
>>
> I found that `%s nosinum` for the ll-file...
> Now I can do things like this:
> "+" BEGIN(INITIAL); return yy::parser::make_ADD(loc);
> "(" BEGIN(INITIAL); return yy::parser::make_BROP(loc);
> ")" BEGIN(nosinum); return yy::parser::make_BRCL(loc);
> {bint} BEGIN(nosinum); return make_INT(yytext,loc);
> {float} BEGIN(nosinum); return make_FLOAT(yytext,loc);
> [+-]?{bint} BEGIN(nosinum); return make_INT(yytext,loc);
> [+-]?{float} BEGIN(nosinum); return make_FLOAT(yytext,loc);
>
> and i removed the SNUM token...
>
> now it seems to work just right.. 🥳
>
> it even handles the whitespaces to my liking... 😋
>
> but i do not know what kind of formal language that is now...

Context switches are best avoided unless absolutely necessary, in my experience.
So if one designs one's own language, it might be good to try to avoid them by a change in the grammar.

It might be confusing with -2^4 meaning (-2)^4, because in 1 - 2^4, it should be 1 - (2^4),
and 1 -2^4 would be an error if two numbers cannot follow each other.
Re: how to solve this reduce/reduce conflict?
> On 22 Sep 2022, at 08:30, Lukas Arsalan wrote:
>
> Hi,
>
> At 2022-09-22T07:08:55CEST Akim Demaille wrote:
>> This snippet is clearly ambiguous, since it allows two different parses of
>> -1, which -Wcex nicely showed.
>>
> yes. right.
>
>> If I were you, I would handle this in the scanner. IOW, the scanner should
>> be extended to support signed literals, and process that initial `-`.
>>
> uhm... is that possible?
> e. g.:
> [1] -1 --> "num"
> [2] 1-2 --> "num" "-" "num"
> [3] (-1^-2) --> "(" "num" "^" "num" ")"
> [4] 1--2 --> "num" "-" "num"
> [5] 1---3 --> "num" "-" "-" "num"
> [6] 1-2^3 --> "num" "-" "num" "^" "num"
> I do not think that it is possible, to do that with regular expressions...

I think it is not possible, so therefore one expects -2⁴ to be parsed as -(2⁴).
Re: how to solve this reduce/reduce conflict?
Hi,

At 2022-09-22T07:08:55CEST Akim Demaille wrote:
> This snippet is clearly ambiguous, since it allows two different parses of
> -1, which -Wcex nicely showed.

yes. right.

> If I were you, I would handle this in the scanner. IOW, the scanner should
> be extended to support signed literals, and process that initial `-`.

uhm... is that possible?
e. g.:
[1] -1 --> "num"
[2] 1-2 --> "num" "-" "num"
[3] (-1^-2) --> "(" "num" "^" "num" ")"
[4] 1--2 --> "num" "-" "num"
[5] 1---3 --> "num" "-" "-" "num"
[6] 1-2^3 --> "num" "-" "num" "^" "num"
I do not think that it is possible, to do that with regular expressions...
flex would have to remember the previous token, that it found...
A "-" after no token and after the tokens "-", "+", "^", "(" is a sign, if followed by a digit... else the "-" is an operator...
That sounds like bison's job... right? Or can it be done in flex?
Or can bison reject a token (e. g. a "num" after a "num" in the case of "1-2") and make flex use a different rule?

and i m still not sure, what to do with whitespaces (e.g. "- 3")... currently i just ignore them...

What kind of grammars is bison capable of parsing? I mean: Is my grammar too complicated for flex&bison?

> So the grammar would no longer include `exp: "num"`.

u mean `exp: "-" "num"`?

> Your actions look quite badly typed. And `std::endl` should seldom be used,
> `'\n'` is enough.

I am concerned about
1. parser errors
2. memory leaks
3. efficiency of debug code... :-)

Thx.
-arne
Re: how to solve this reduce/reduce conflict?
Hi,

> On 21 Sept 2022, at 23:31, Lukas Arsalan wrote:
>
> exp:
>   "-" "num" { $$ = -*new Float($2); std::cout << "NUMinv" << $$ << std::endl; }
> | "num"     { $$ = new Float($1); std::cout << "num" << $$ << std::endl; }
> | "-" exp   { $$ = -*$2; std::cout << "inv" << $$ << std::endl; }

This snippet is clearly ambiguous, since it allows two different parses of -1, which -Wcex nicely showed.

If I were you, I would handle this in the scanner. IOW, the scanner should be extended to support signed literals, and process that initial `-`.

So the grammar would no longer include `exp: "num"`.

Your actions look quite badly typed. And `std::endl` should seldom be used, `'\n'` is enough.

Cheers!
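A small sketch of what a more tightly typed action could look like, assuming a hypothetical value-type `Float` (the real class is not shown in the thread) and a hypothetical helper `negate_and_log` standing in for the body of the `"-" exp` action. It avoids `new` entirely, so there is nothing to leak, and it writes `'\n'` rather than `std::endl`, which would also flush the stream on every call.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>  // std::ostringstream, handy for capturing the log in tests

// Hypothetical stand-in for the thread's `Float`; the real class is not shown.
struct Float {
    double value;
    // Unary minus returns a new value instead of touching heap objects.
    Float operator-() const { return Float{-value}; }
};

std::ostream& operator<<(std::ostream& os, const Float& f) {
    return os << f.value;
}

// Roughly what the body of an `exp: "-" exp` action could do with value
// semantics: negate the operand, write one debug line, return the result.
Float negate_and_log(const Float& operand, std::ostream& log) {
    assert(log.good());  // the caller is expected to pass a usable stream
    Float result = -operand;
    log << "inv" << result << '\n';  // '\n' does not force a flush
    return result;
}
```

Whether value semantics fit depends on how the parser's semantic value type is declared, which the quoted snippet does not show; treat this as one possible direction rather than a drop-in replacement.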