Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-22 Thread David Gibson
On Mon, Oct 22, 2007 at 04:13:54PM -0500, Jon Loeliger wrote:
> Previously, there were a few shift/reduce and reduce/reduce
> errors in the grammar that were being handled by the not-so-popular
> GLR Parser technique.

I haven't actually heard anyone whinge about glr-parser...

> Flip a right-recursive stack-abusing rule into a left-recursive
> stack-friendly rule and clear up three messes in one shot: No more
> conflicts, no need for the GLR parser, and friendlier stackness.

Ouch.  I'm feeling a bit stupid now, I really thought our conflicts
were somewhere else.  Specifically I thought the problem was that we
needed to look ahead more tokens that we were able to differentiate
between property and subnode definitions, i.e. between:
label propname =
and
label propname {

Except... I'm almost certain the conflicts first appeared when I added
labels, and I can't see how that would affect this.  Well, colour me
baffled.

Especially since the comments and content of commit
4102d840d993e7cce7d5c5aea8ef696dc81236fc (second commit in the entire
history!) appear to back up my memory of this.  I used to have a
lookahead hack in the lexer to remove the conflict.

But this patch certainly seems to make the conflicts go away, so I'm
confused.


Well, regardless of that, I have a few concerns.

First, a trivial one: I remember leaving this as a right-recursion,
despite the stack-nastiness, because that way the properties end up in
the same order as in the source.  I think that behaviour is worth
preserving, but of course we can do it with left-recursion by changing
chain_property() to add to the end of the list instead of the
beginning.  Also, if we're going to avoid right-recursion here, we
should do so for the 'subnodes' productions as well, which is
completely analogous.

More significantly, I don't know that we want to burn our bridges with
glr-parser.  glr-parser is a beautiful algorithm which means we can
use essentially whatever form of grammar is the easiest to work with
without having to fiddle about to ensure it's LALR(1).  This could
still be useful if we encounter some less easily finessable grammar
construct in future.

And even without glr-parser, I'm still uncomfortable with the
lexer<->parser execution ordering issues with the current
/dts-version/ proposal.  It may now be true that the order is
guaranteed to be correct, but it's still not exactly obvious.

It seems to me that the version change introduces a lexical change to
the input format, and should therefore be handled at the lexical
level.  And I think there are other potential advantages to parsing
the version identifier as a token, rather than as an integer (such as
being able to define entirely different grammars for different
versions, if we have to).

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread Jon Loeliger
So, like, the other day David Gibson mumbled:
> On Mon, Oct 22, 2007 at 04:13:54PM -0500, Jon Loeliger wrote:
> > Previously, there were a few shift/reduce and reduce/reduce
> > errors in the grammar that were being handled by the not-so-popular
> > GLR Parser technique.
> 
> I haven't actually heard anyone whinge about glr-parser...

I have. :-)

> > Flip a right-recursive stack-abusing rule into a left-recursive
> > stack-friendly rule and clear up three messes in one shot: No more
> > conflicts, no need for the GLR parser, and friendlier stackness.
> 
> Ouch.  I'm feeling a bit stupid now,

Absolutely no need for that.

> I really thought our conflicts
> were somewhere else.  Specifically I thought the problem was that we
> needed to look ahead more tokens that we were able to differentiate
> between property and subnode definitions, i.e. between:
>   label propname =
> and
>   label propname {

Yes, it was.  When you compute the closure of the propdef with
the rule as a right-recursive, that's when you get the conflict.


> Well, regardless of that, I have a few concerns.
> 
> First, a trivial one: I remember leaving this as a right-recursion,
> despite the stack-nastiness, because that way the properties end up in
> the same order as in the source.  I think that behaviour is worth
> preserving, but of course we can do it with left-recursion by changing
> chain_property() to add to the end of the list instead of the
> beginning.

Understood.  And I wrestled with that as well.  In fact, I even
wrote the reverse_properties() function, which I will include,
and used it initially.  However, several test failed.  So I
removed it, and it all started happily working again.

> Also, if we're going to avoid right-recursion here, we
> should do so for the 'subnodes' productions as well, which is
> completely analogous.

Well, isn't _that_ an interesting observation... :-)
I'll add it to the grand "DTC To Do" list.

> More significantly, I don't know that we want to burn our bridges with
> glr-parser.  glr-parser is a beautiful algorithm which means we can
> use essentially whatever form of grammar is the easiest to work with
> without having to fiddle about to ensure it's LALR(1).  This could
> still be useful if we encounter some less easily finessable grammar
> construct in future.

I'm not saying we can't use it in the future, as needed!  I'm just
saying it isn't strictly necessary now.

> And even without glr-parser, I'm still uncomfortable with the
> lexer<->parser execution ordering issues with the current
> /dts-version/ proposal.  It may now be true that the order is
> guaranteed to be correct, but it's still not exactly obvious.

I'm fine with it, and though I read your words, I'm not really
sure why you are not  In the long term, maybe think of it
as a temporary hack then.  We'll convert the existing DTS files
over to the new version, and then deprecate the "dts-version 0"
files and support and it will all go away relatively soon.


> It seems to me that the version change introduces a lexical change to
> the input format, and should therefore be handled at the lexical
> level.  And I think there are other potential advantages to parsing
> the version identifier as a token, rather than as an integer (such as
> being able to define entirely different grammars for different
> versions, if we have to).

Version symbol or number, that's still not precluded, I don't think.

jdl
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread Segher Boessenkool
>>> Flip a right-recursive stack-abusing rule into a left-recursive
>>> stack-friendly rule and clear up three messes in one shot: No more
>>> conflicts, no need for the GLR parser, and friendlier stackness.
>>
>> Ouch.  I'm feeling a bit stupid now,
>
> Absolutely no need for that.

If you haven't had "exp := aexp | exp aexp" beaten into you with
a big stick, maybe you should be happy about that ("'s got a nail
in it!") :-)

>> And even without glr-parser, I'm still uncomfortable with the
>> lexer<->parser execution ordering issues with the current
>> /dts-version/ proposal.  It may now be true that the order is
>> guaranteed to be correct, but it's still not exactly obvious.

If you require /dts-version/ (and similar global dtc-control stmts)
to be at the start of the file, can't you avoid this ordering problem
by starting to parse the file with a simple (hand-written) parser
(which would handle these statements) and only when you cannot parse
any more switch to the "normal" parser (which won't handle them)?
Or is this a stupid suggestion :-)


Segher

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread Jon Loeliger
So, like, the other day Segher Boessenkool mumbled:
> 
> >> And even without glr-parser, I'm still uncomfortable with the
> >> lexer<->parser execution ordering issues with the current
> >> /dts-version/ proposal.  It may now be true that the order is
> >> guaranteed to be correct, but it's still not exactly obvious.
> 
> If you require /dts-version/ (and similar global dtc-control stmts)
> to be at the start of the file, can't you avoid this ordering problem
> by starting to parse the file with a simple (hand-written) parser
> (which would handle these statements) and only when you cannot parse
> any more switch to the "normal" parser (which won't handle them)?
> Or is this a stupid suggestion :-)

My concern here is that I think we are attempting to make
this significantly more complex than it needs to be.
We have a pretty good KISS opportunity here, and I've shown
it to be working so far.  Longer term, all this cruft will
go away and it will become a moot point anyway.

jdl
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread Jon Loeliger
So, like, the other day Jon Loeliger mumbled:
> 
> > First, a trivial one: I remember leaving this as a right-recursion,
> > despite the stack-nastiness, because that way the properties end up in
> > the same order as in the source.  I think that behaviour is worth
> > preserving, but of course we can do it with left-recursion by changing
> > chain_property() to add to the end of the list instead of the
> > beginning.
> 
> Understood.  And I wrestled with that as well.  In fact, I even
> wrote the reverse_properties() function, which I will include,
> and used it initially.  However, several test failed.  So I
> removed it, and it all started happily working again.

I was confused.  It was the version of the code that _did_
use the property reversal that worked.

Even Milton's asm test with labels worked. :-)

jdl
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread David Gibson
On Tue, Oct 23, 2007 at 04:41:51PM +0200, Segher Boessenkool wrote:
> >>> Flip a right-recursive stack-abusing rule into a left-recursive
> >>> stack-friendly rule and clear up three messes in one shot: No more
> >>> conflicts, no need for the GLR parser, and friendlier stackness.
> >>
> >> Ouch.  I'm feeling a bit stupid now,
> >
> > Absolutely no need for that.
> 
> If you haven't had "exp := aexp | exp aexp" beaten into you with
> a big stick, maybe you should be happy about that ("'s got a nail
> in it!") :-)
> 
> >> And even without glr-parser, I'm still uncomfortable with the
> >> lexer<->parser execution ordering issues with the current
> >> /dts-version/ proposal.  It may now be true that the order is
> >> guaranteed to be correct, but it's still not exactly obvious.
> 
> If you require /dts-version/ (and similar global dtc-control stmts)
> to be at the start of the file, can't you avoid this ordering problem
> by starting to parse the file with a simple (hand-written) parser
> (which would handle these statements) and only when you cannot parse
> any more switch to the "normal" parser (which won't handle them)?
> Or is this a stupid suggestion :-)

Aieee, the pain!  No, please let's keep all the grammar information in
one place.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread David Gibson
On Tue, Oct 23, 2007 at 09:49:09AM -0500, Jon Loeliger wrote:
> So, like, the other day Segher Boessenkool mumbled:
> > 
> > >> And even without glr-parser, I'm still uncomfortable with the
> > >> lexer<->parser execution ordering issues with the current
> > >> /dts-version/ proposal.  It may now be true that the order is
> > >> guaranteed to be correct, but it's still not exactly obvious.
> > 
> > If you require /dts-version/ (and similar global dtc-control stmts)
> > to be at the start of the file, can't you avoid this ordering problem
> > by starting to parse the file with a simple (hand-written) parser
> > (which would handle these statements) and only when you cannot parse
> > any more switch to the "normal" parser (which won't handle them)?
> > Or is this a stupid suggestion :-)
> 
> My concern here is that I think we are attempting to make
> this significantly more complex than it needs to be.
> We have a pretty good KISS opportunity here, and I've shown
> it to be working so far.  Longer term, all this cruft will
> go away and it will become a moot point anyway.

Well, so am I.  And I think my alternative suggestion (use
'/dts-version-1/') is even simpler: it's recognized in the lexer, so
it's trivial and clear how it can affect both lexical processing and
parsing.  Using a different token here means the "version" can affect
the whole structure of the grammar, without hard to follow
interactions between global variables and parsing behavior (for
instance we can fairly easily grammatically ban the /memreserve/ a-b
form in v1 files).

And I don't see *any* advantage to parsing the version as an integer
rather than a token.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread David Gibson
On Tue, Oct 23, 2007 at 11:07:39AM -0500, Jon Loeliger wrote:
> So, like, the other day Jon Loeliger mumbled:
> > 
> > > First, a trivial one: I remember leaving this as a right-recursion,
> > > despite the stack-nastiness, because that way the properties end up in
> > > the same order as in the source.  I think that behaviour is worth
> > > preserving, but of course we can do it with left-recursion by changing
> > > chain_property() to add to the end of the list instead of the
> > > beginning.
> > 
> > Understood.  And I wrestled with that as well.  In fact, I even
> > wrote the reverse_properties() function, which I will include,
> > and used it initially.  However, several test failed.  So I
> > removed it, and it all started happily working again.
> 
> I was confused.  It was the version of the code that _did_
> use the property reversal that worked.

Ok.  Except that I think we shouldn't need to have an explicit
reverse_properties().  Just build the list in-order by having
chain_property() append to the end of the list.

It's theoretically expensive, since we have to walk the list each
time, but frankly I don't think we need to worry about dtc performance
until we actually see a single non-contrived tree that takes a
noticeable amount of time to process:  I've never seen one yet.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread David Gibson
On Tue, Oct 23, 2007 at 09:24:52AM -0500, Jon Loeliger wrote:
> So, like, the other day David Gibson mumbled:
> > On Mon, Oct 22, 2007 at 04:13:54PM -0500, Jon Loeliger wrote:
[snip]
> > I really thought our conflicts
> > were somewhere else.  Specifically I thought the problem was that we
> > needed to look ahead more tokens that we were able to differentiate
> > between property and subnode definitions, i.e. between:
> > label propname =
> > and
> > label propname {
> 
> Yes, it was.  When you compute the closure of the propdef with
> the rule as a right-recursive, that's when you get the conflict.

Ah, right.  Been too long since studied LR parsers, obviously.

[snip]
> > More significantly, I don't know that we want to burn our bridges with
> > glr-parser.  glr-parser is a beautiful algorithm which means we can
> > use essentially whatever form of grammar is the easiest to work with
> > without having to fiddle about to ensure it's LALR(1).  This could
> > still be useful if we encounter some less easily finessable grammar
> > construct in future.
> 
> I'm not saying we can't use it in the future, as needed!  I'm just
> saying it isn't strictly necessary now.

Sure.  Sorry, I was unclear; I wasn't saying we shouldn't remove the
glr-parser option now: certainly we should remove that
right-recursion, and then we don't need it.  I'm saying we shouldn't
do other things which would preclude bringing it back.

> > And even without glr-parser, I'm still uncomfortable with the
> > lexer<->parser execution ordering issues with the current
> > /dts-version/ proposal.  It may now be true that the order is
> > guaranteed to be correct, but it's still not exactly obvious.
> 
> I'm fine with it, and though I read your words, I'm not really
> sure why you are not  In the long term, maybe think of it
> as a temporary hack then.  We'll convert the existing DTS files
> over to the new version, and then deprecate the "dts-version 0"
> files and support and it will all go away relatively soon.

I just think that putting the version number into the token is
simpler, clearer and has no disadvantage compared to a version
integer.  And will remain so if we have to bump the version again.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev