Re: how to solve this reduce/reduce conflict?

2022-09-23 Thread lostbits
To interject: the 'modern' notion of spaces follows from how we now 
understand parser generators, and BNF in particular. In times long past 
that understanding did not yet exist. In Fortran II, IV and 77, spaces 
had no meaning. This meant that:


   xy and x y were the same
   do 100 i = 1, 2, 3 and do1 00  i= 1 ,2   , 3 were the same.
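
As an illustration of the two lines above: a scanner for such a language could, in effect, drop every blank before looking at the text. A minimal C++ sketch, where the helper name `strip_blanks` is made up for this example and is not from any real Fortran compiler:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: remove every blank, roughly what a scanner for a
// blank-insensitive language has to do before recognizing statements.
std::string strip_blanks(std::string s) {
    s.erase(std::remove(s.begin(), s.end(), ' '), s.end());
    return s;
}
```

With this, `strip_blanks("do 100 i = 1, 2, 3")` and `strip_blanks("do1 00  i= 1 ,2   , 3")` give the same string, which is why both spellings had to mean the same statement.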

We now have a better understanding of BNF and compiling. The issue 
should be one of support. Towards this end let me make the following 
observations:


1.    Spaces should be used as delimiters because that makes parsing easier.
2.    If spaces are to have a special meaning then, if possible, they 
should be handled in flex.
3.    If flex cannot do it, then the parser must be used, as in this 
example (Fortran II, IV, 77):


   doloop : "do" number id "=" number "," number "," number
 |  id "=" number "," number "," number
   and I realize this is not complete, this is only an example.

   The thing that distinguishes an assignment (id "=" number) from a
   doloop is the ",".
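
The observation about the "," can be sketched in C++ as a small classifier. This is only an illustration under loud assumptions: the names `Stmt` and `classify` are invented here, keywords are lower case, and the line contains no parentheses or string literals.

```cpp
#include <algorithm>
#include <string>

// Hypothetical statement kinds for a blank-insensitive, Fortran-like line.
enum class Stmt { DoLoop, Assignment, Other };

// Assumption: lower-case input without parentheses or string literals.
Stmt classify(std::string line) {
    // Blanks carry no meaning, so drop them first.
    line.erase(std::remove(line.begin(), line.end(), ' '), line.end());
    const auto eq = line.find('=');
    if (eq == std::string::npos) return Stmt::Other;
    // A comma after '=' marks the loop bounds of a do statement.
    const bool comma_after_eq = line.find(',', eq) != std::string::npos;
    if (line.compare(0, 2, "do") == 0 && comma_after_eq) return Stmt::DoLoop;
    return Stmt::Assignment;
}
```

Under these assumptions `classify("do 100 i = 1, 2, 3")` is a do loop, while `classify("do 100 i = 1.2")` comes out as an assignment to a variable spelled `do100i`, since nothing but the comma tells the two apart.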

Based on your example I assume that having flex change "1 23" into "123" 
would be a better approach than using the parser. In this case it might 
be better to substitute something else for the space. If I remember 
correctly, Ada allows 1_234_567 where you would (perhaps) write "1 234 567".
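
A sketch of that substitution in C++, assuming the scanner has already matched the whole literal. The helper name `parse_grouped` is invented for this example; in a flex action it would presumably be called on `yytext`:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: accept Ada-style digit grouping ("1_234_567")
// by dropping the underscores before conversion.
// Assumption: the text was already validated as digits and underscores.
long parse_grouped(std::string text) {
    text.erase(std::remove(text.begin(), text.end(), '_'), text.end());
    return std::stol(text);
}
```

The point of using a visible separator is that the scanner rule stays an ordinary regular expression and whitespace can keep its usual role as a delimiter.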


As to the reason that spaces are used in your application I think that 
this will make life difficult for you, and I personally would not do it. 
In similar manner some of your comments do not accord well with current 
practices, such as the separate meanings given "- 123" and "-123". I 
think you will face increasing difficulty in implementing some of your 
ideas and I would encourage you to redirect your efforts to more 
conventional approaches.


The example given on typesetters has little relevance. Typesetters deal 
with visual presentation. Parsers and lexers do not, and they have 
well-thought-out means of describing things.


But, it's your code and your issue and I don't care.

art

On 9/23/2022 4:52 AM, AW wrote:
> On 22/09/2022 21:34, Derek Clegg wrote:
>> This is horrid, and not how math works. Spaces necessarily mean nothing,
>> and imbuing them with meaning is nonsense.
>
> if u want to say one-hundred-twenty-three u would not think, that C
> understands "1 23"...
> I feel like "-123" and "- 123" are not quite the same...
> That feels more natural to me...
> But I noticed that it was quite tricky to teach flex the difference
> between sign and unary-inversion-operator...
>
> On 2022-09-23T12:47:23CEST Evan Lavelle wrote:
>> It's a programming language, not maths. There are, of course, languages
>> in which spaces necessarily mean something. But I can't bring myself to
>> use any of them.
>
> Maybe this example demonstrates a use: "123" is usually not the same as
> "12 3"...
> Maybe something like "1 234 567" might be used in newspapers for "1234567",
> but the professional typesetters use special space symbols then,
> which are thinner and they stick hard to their neighbors,
> so that a newline is impossible at them...
>
> -arne




Re: how to solve this reduce/reduce conflict?

2022-09-23 Thread AW
On 22/09/2022 21:34, Derek Clegg wrote:
> This is horrid, and not how math works. Spaces necessarily mean nothing,
> and imbuing them with meaning is nonsense.
>
if u want to say one-hundred-twenty-three u would not think, that C
understands "1 23"...
I feel like "-123" and "- 123" are not quite the same...
That feels more natural to me...
But I noticed that it was quite tricky to teach flex the difference
between sign and unary-inversion-operator... 

On 2022-09-23T12:47:23CEST Evan Lavelle  wrote:
> It's a programming language, not maths. There are, of course, languages 
> in which spaces necessarily mean something. But I can't bring myself to 
> use any of them.
>
Maybe this example demonstrates a use: "123" is usually not the same as
"12 3"...
Maybe something like "1 234 567" might be used in newspapers for "1234567",
but the professional typesetters use special space symbols then,
which are thinner and they stick hard to their neighbors,
so that a newline is impossible at them... 

-arne




Re: how to solve this reduce/reduce conflict?

2022-09-23 Thread Evan Lavelle

On 22/09/2022 21:34, Derek Clegg wrote:
> This is horrid, and not how math works. Spaces necessarily mean nothing, and 
> imbuing them with meaning is nonsense.
>
> Please reconsider your grammar.


It's a programming language, not maths. There are, of course, languages 
in which spaces necessarily mean something. But I can't bring myself to 
use any of them.


I don't like it, but chacun a son gout.




Re: how to solve this reduce/reduce conflict?

2022-09-22 Thread Derek Clegg
This is horrid, and not how math works. Spaces necessarily mean nothing, and 
imbuing them with meaning is nonsense. 

Please reconsider your grammar. 

> On Sep 22, 2022, at 8:28 PM, Lukas Arsalan  wrote:
> 
> On 2022-09-22T15:54:31UTC Hans Åberg  wrote:
>> Context switches are best avoided unless absolutely necessary, in my 
>> experience.
>> So if one designs one's own language, it might be good to try to avoid them
>> by a change in the grammar.
>> 
> OK... I know that there are no signed numbers usually... But I wanted to try 
> to change that...
> So for _me_ in "-2" the minus is a sign... And in "- 2" the minus is a unary 
> inversion operator... And in "1-2"  the minus is a subtraction operator (or 
> an abbreviation for "1+-2" respectively (where the minus is a sign again))...
> This can all be done quite elegantly with this context trick in the ll-file...
> 
>> It might be confusing with -2^4 meaning (-2)^4, because in 1 - 2^4, it 
>> should be 1 - (2^4),
>> and 1 -2^4 would be an error if two numbers cannot follow each other.
>> 
> "1 -2^4" is no error in my program... it results in "-15".
> It even says, that "- 2^4" is "-16", while "-2^4" is "16".
> 
> Do u think there will be any unwanted side effects?
> 
> -arne
> 



Re: how to solve this reduce/reduce conflict?

2022-09-22 Thread Lukas Arsalan
On 2022-09-22T15:54:31UTC Hans Åberg  wrote:
> Context switches are best avoided unless absolutely necessary, in my 
> experience.
> So if one designs one's own language, it might be good to try to avoid them
> by a change in the grammar.
>
OK... I know that there are no signed numbers usually... But I wanted to try to 
change that...
So for _me_ in "-2" the minus is a sign... And in "- 2" the minus is a unary 
inversion operator... And in "1-2"  the minus is a subtraction operator (or an 
abbreviation for "1+-2" respectively (where the minus is a sign again))...
This can all be done quite elegantly with this context trick in the ll-file...

> It might be confusing with -2^4 meaning (-2)^4, because in 1 - 2^4, it should 
> be 1 - (2^4),
> and 1 -2^4 would be an error if two numbers cannot follow each other.
>
"1 -2^4" is no error in my program... it results in "-15".
It even says, that "- 2^4" is "-16", while "-2^4" is "16".
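
The three results quoted here correspond to three different groupings. A small C++ sketch of just the arithmetic, assuming `^` means exponentiation and binds tighter than a unary or binary minus (the function names are made up for illustration):

```cpp
#include <cmath>

// "-2^4" with the minus as part of the literal: (-2)^4.
double sign_in_literal() { return std::pow(-2.0, 4.0); }

// "- 2^4" with the minus as a unary operator applied afterwards: -(2^4).
double unary_minus() { return -std::pow(2.0, 4.0); }

// "1 -2^4" with the minus read as subtraction: 1 - (2^4).
double subtraction() { return 1.0 - std::pow(2.0, 4.0); }
```

These evaluate to 16, -16 and -15 respectively, so a single blank decides between 16 and -16.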

Do u think there will be any unwanted side effects?

-arne



Re: how to solve this reduce/reduce conflict?

2022-09-22 Thread Hans Åberg


> On 22 Sep 2022, at 21:02, Lukas Arsalan  wrote:
> 
> On 2022-09-22T15:54:31UTC Hans Åberg  wrote:
>> Context switches are best avoided unless absolutely necessary, in my 
>> experience.
>> So if one designs one's own language, it might be good to try to avoid them
>> by a change in the grammar.
>> 
> OK... I know that there are no signed numbers usually... But I wanted to try 
> to change that...
> So for _me_ in "-2" the minus is a sign... And in "- 2" the minus is a unary 
> inversion operator... And in "1-2"  the minus is a subtraction operator (or 
> an abbreviation for "1+-2" respectively (where the minus is a sign again))...
> This can all be done quite elegantly with this context trick in the ll-file...

I think the C/C++ interpretation with a unary operator and no signed integer 
literals is the best one for arithmetic expressions. Having a sign as a part 
of the number may be suitable in other contexts.

>> It might be confusing with -2^4 meaning (-2)^4, because in 1 - 2^4, it 
>> should be 1 - (2^4),
>> and 1 -2^4 would be an error if two numbers cannot follow each other.
>> 
> "1 -2^4" is no error in my program... it results in "-15".
> It even says, that "- 2^4" is "-16", while "-2^4" is "16".
> 
> Do u think there will be any unwanted side effects?

In the minds of those interpreting it. :-)





Re: how to solve this reduce/reduce conflict?

2022-09-22 Thread Lukas Arsalan
On 2022-09-22T07:57:45UTC Hans Åberg  wrote:
> On 22 Sep 2022, at 08:30, Lukas Arsalan  wrote:
>> [1] -1 --> "num"
>> [2] 1-2 --> "num" "-" "num"
>> [3] (-1^-2) --> "(" "num" "^" "num" ")"
>> [4] 1--2 --> "num" "-" "num"
>> [5] 1---3 --> "num" "-" "-" "num"
>> [6] 1-2^3 --> "num" "-" "num" "^" "num"
>> I do not think that it is possible, to do that with regular expressions...
>>
> I think it is not possible, so therefore one expects -2⁴ to be parsed as 
> -(2⁴).
>
I found that `%s nosinum` for the ll-file...
Now I can do things like this:
"+" BEGIN(INITIAL); return yy::parser::make_ADD(loc);
"(" BEGIN(INITIAL); return yy::parser::make_BROP(loc);
")" BEGIN(nosinum); return yy::parser::make_BRCL(loc);
{bint}  BEGIN(nosinum); return make_INT(yytext,loc);
{float} BEGIN(nosinum); return make_FLOAT(yytext,loc);
[+-]?{bint}    BEGIN(nosinum); return make_INT(yytext,loc);
[+-]?{float}   BEGIN(nosinum); return make_FLOAT(yytext,loc);

and i removed the SNUM token...

now it seems to work just right..

it even handles the whitespaces to my liking... 

but i do not know what kind of formal language that is now...

-arne



Re: how to solve this reduce/reduce conflict?

2022-09-22 Thread Hans Åberg


> On 22 Sep 2022, at 16:52, Lukas Arsalan  wrote:
> 
> On 2022-09-22T07:57:45UTC Hans Åberg  wrote:
>> On 22 Sep 2022, at 08:30, Lukas Arsalan  wrote:
>>> [1] -1 --> "num"
>>> [2] 1-2 --> "num" "-" "num"
>>> [3] (-1^-2) --> "(" "num" "^" "num" ")"
>>> [4] 1--2 --> "num" "-" "num"
>>> [5] 1---3 --> "num" "-" "-" "num"
>>> [6] 1-2^3 --> "num" "-" "num" "^" "num"
>>> I do not think that it is possible, to do that with regular expressions...
>>> 
>> I think it is not possible, so therefore one expects -2⁴ to be parsed as 
>> -(2⁴).
>> 
> I found that `%s nosinum` for the ll-file...
> Now I can do things like this:
> "+" BEGIN(INITIAL); return yy::parser::make_ADD(loc);
> "(" BEGIN(INITIAL); return yy::parser::make_BROP(loc);
> ")" BEGIN(nosinum); return yy::parser::make_BRCL(loc);
> {bint}  BEGIN(nosinum); return make_INT(yytext,loc);
> {float} BEGIN(nosinum); return make_FLOAT(yytext,loc);
> [+-]?{bint}    BEGIN(nosinum); return make_INT(yytext,loc);
> [+-]?{float}   BEGIN(nosinum); return make_FLOAT(yytext,loc);
> 
> and i removed the SNUM token...
> 
> now it seems to work just right..
> 
> it even handles the whitespaces to my liking... 
> 
> but i do not know what kind of formal language that is now...

Context switches are best avoided unless absolutely necessary, in my 
experience. So if one designs one's own language, it might be good to try to 
avoid them by a change in the grammar.

It might be confusing with -2^4 meaning (-2)^4, because in 1 - 2^4, it should 
be 1 - (2^4), and 1 -2^4 would be an error if two numbers cannot follow each 
other.





Re: how to solve this reduce/reduce conflict?

2022-09-22 Thread Hans Åberg


> On 22 Sep 2022, at 08:30, Lukas Arsalan  wrote:
> 
> Hi,
> 
> At 2022-09-22T07:08:55CEST Akim Demaille  wrote:
>> This snippet is clearly ambiguous, since it allows two different parses of 
>> -1, which -Wcex nicely showed.
>> 
> yes. right.
> 
>> If I were you, I would handle this in the scanner.  IOW, the scanner should 
>> be extended to support signed literals, and process that initial `-`.
>> 
> uhm... is that possible?
> e. g.:
> [1] -1 --> "num"
> [2] 1-2 --> "num" "-" "num"
> [3] (-1^-2) --> "(" "num" "^" "num" ")"
> [4] 1--2 --> "num" "-" "num"
> [5] 1---3 --> "num" "-" "-" "num"
> [6] 1-2^3 --> "num" "-" "num" "^" "num"
> I do not think that it is possible, to do that with regular expressions...

I think it is not possible, so therefore one expects -2⁴ to be parsed as -(2⁴).





Re: how to solve this reduce/reduce conflict?

2022-09-22 Thread Lukas Arsalan
Hi,

At 2022-09-22T07:08:55CEST Akim Demaille  wrote:
> This snippet is clearly ambiguous, since it allows two different parses of 
> -1, which -Wcex nicely showed.
>
yes. right.

> If I were you, I would handle this in the scanner.  IOW, the scanner should 
> be extended to support signed literals, and process that initial `-`.
>
uhm... is that possible?
e. g.:
[1] -1 --> "num"
[2] 1-2 --> "num" "-" "num"
[3] (-1^-2) --> "(" "num" "^" "num" ")"
[4] 1--2 --> "num" "-" "num"
[5] 1---3 --> "num" "-" "-" "num"
[6] 1-2^3 --> "num" "-" "num" "^" "num"
I do not think that it is possible, to do that with regular expressions...
flex would have to remember the previous token, that it found...
A "-" after no token and after the tokens "-", "+", "^", "(" is a sign, if 
followed by a digit...
else the "-" is an operator...
That sounds like bison's job... right?

Or can it be done in flex?
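
The rule described above (a "-" is a sign only at the start or after "-", "+", "^", "(", and only if a digit follows directly) can be sketched as a hand-written tokenizer. This is plain C++ for illustration, not flex code; the function name `tokenize` and the choice to handle integers only are assumptions made for this sketch.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Illustrative sketch: "num" stands for any (possibly signed) integer,
// every other token is the single character itself.
std::vector<std::string> tokenize(const std::string& in) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < in.size()) {
        const char c = in[i];
        if (std::isspace(static_cast<unsigned char>(c))) { ++i; continue; }
        // The previously emitted token decides whether a sign is allowed.
        const bool sign_allowed = out.empty() || out.back() == "-" ||
            out.back() == "+" || out.back() == "^" || out.back() == "(";
        // The digit has to follow immediately, so "- 3" is not a signed number.
        const bool digit_next = i + 1 < in.size() &&
            std::isdigit(static_cast<unsigned char>(in[i + 1]));
        if (std::isdigit(static_cast<unsigned char>(c)) ||
            (c == '-' && sign_allowed && digit_next)) {
            ++i;  // consume the sign or the first digit
            while (i < in.size() &&
                   std::isdigit(static_cast<unsigned char>(in[i]))) ++i;
            out.push_back("num");
        } else {
            out.push_back(std::string(1, c));  // operator or parenthesis
            ++i;
        }
    }
    return out;
}
```

Run on the six examples listed above, this produces the token sequences given there; for instance `tokenize("1---3")` yields "num" "-" "-" "num". The only state it needs is the last token emitted, which is a small amount of context rather than a full parse.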

Or can bison reject a token (e. g. a "num" after a "num" in the case of "1-2") 
and make flex using a different rule?

and i m still not sure, what to do with whitespaces (e.g. "- 3")...
currently i just ignore them...

What kind of grammars is bison capable of parsing?
I mean: Is my grammar too complicated for flex?

> So the grammar would no longer include `exp: "num"`.
>
u mean `exp: "-" "num"`?

> Your actions look quite badly typed.  And `std::endl` should seldom be used, 
> `'\n'` is enough.
>
I am concerned about
1. parser errors
2. memory leaks
3. efficiency of debug code... :-)

Thx.

-arne



Re: how to solve this reduce/reduce conflict?

2022-09-21 Thread Akim Demaille
Hi,

> On 21 Sep 2022, at 23:31, Lukas Arsalan wrote:
> 
> exp:
>    "-" "num"  { $$ = -*new Float($2); std::cout << "NUMinv" << $$ << std::endl; }
> |  "num"      { $$ = new Float($1); std::cout << "num" << $$ << std::endl; }
> |  "-" exp    { $$ = -*$2; std::cout << "inv" << $$ << std::endl; }

This snippet is clearly ambiguous, since it allows two different parses of -1, 
which -Wcex nicely showed.

If I were you, I would handle this in the scanner.  IOW, the scanner should be 
extended to support signed literals, and process that initial `-`.  So the 
grammar would no longer include `exp: "num"`.

Your actions look quite badly typed.  And `std::endl` should seldom be used, 
`'\n'` is enough.

Cheers!