Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread Jon Loeliger
So, like, the other day David Gibson mumbled:
 On Mon, Oct 22, 2007 at 04:13:54PM -0500, Jon Loeliger wrote:
  Previously, there were a few shift/reduce and reduce/reduce
  errors in the grammar that were being handled by the not-so-popular
  GLR Parser technique.
 
 I haven't actually heard anyone whinge about glr-parser...

I have. :-)

  Flip a right-recursive stack-abusing rule into a left-recursive
  stack-friendly rule and clear up three messes in one shot: No more
  conflicts, no need for the GLR parser, and friendlier stackness.
 
 Ouch.  I'm feeling a bit stupid now,

Absolutely no need for that.

 I really thought our conflicts
 were somewhere else.  Specifically I thought the problem was that we
 needed to look ahead more tokens that we were able to differentiate
 between property and subnode definitions, i.e. between:
   label propname =
 and
   label propname {

Yes, it was.  When you compute the closure of the propdef with
the rule as a right-recursive, that's when you get the conflict.


 Well, regardless of that, I have a few concerns.
 
 First, a trivial one: I remember leaving this as a right-recursion,
 despite the stack-nastiness, because that way the properties end up in
 the same order as in the source.  I think that behaviour is worth
 preserving, but of course we can do it with left-recursion by changing
 chain_property() to add to the end of the list instead of the
 beginning.

Understood.  And I wrestled with that as well.  In fact, I even
wrote the reverse_properties() function, which I will include,
and used it initially.  However, several test failed.  So I
removed it, and it all started happily working again.

 Also, if we're going to avoid right-recursion here, we
 should do so for the 'subnodes' productions as well, which is
 completely analogous.

Well, isn't _that_ an interesting observation... :-)
I'll add it to the grand DTC To Do list.

 More significantly, I don't know that we want to burn our bridges with
 glr-parser.  glr-parser is a beautiful algorithm which means we can
 use essentially whatever form of grammar is the easiest to work with
 without having to fiddle about to ensure it's LALR(1).  This could
 still be useful if we encounter some less easily finessable grammar
 construct in future.

I'm not saying we can't use it in the future, as needed!  I'm just
saying it isn't strictly necessary now.

 And even without glr-parser, I'm still uncomfortable with the
 lexer-parser execution ordering issues with the current
 /dts-version/ proposal.  It may now be true that the order is
 guaranteed to be correct, but it's still not exactly obvious.

I'm fine with it, and though I read your words, I'm not really
sure why you are not  In the long term, maybe think of it
as a temporary hack then.  We'll convert the existing DTS files
over to the new version, and then deprecate the dts-version 0
files and support and it will all go away relatively soon.


 It seems to me that the version change introduces a lexical change to
 the input format, and should therefore be handled at the lexical
 level.  And I think there are other potential advantages to parsing
 the version identifier as a token, rather than as an integer (such as
 being able to define entirely different grammars for different
 versions, if we have to).

Version symbol or number, that's still not precluded, I don't think.

jdl
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread Jon Loeliger
So, like, the other day Jon Loeliger mumbled:
 
  First, a trivial one: I remember leaving this as a right-recursion,
  despite the stack-nastiness, because that way the properties end up in
  the same order as in the source.  I think that behaviour is worth
  preserving, but of course we can do it with left-recursion by changing
  chain_property() to add to the end of the list instead of the
  beginning.
 
 Understood.  And I wrestled with that as well.  In fact, I even
 wrote the reverse_properties() function, which I will include,
 and used it initially.  However, several test failed.  So I
 removed it, and it all started happily working again.

I was confused.  It was the version of the code that _did_
use the property reversal that worked.

Even Milton's asm test with labels worked. :-)

jdl
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread David Gibson
On Tue, Oct 23, 2007 at 04:41:51PM +0200, Segher Boessenkool wrote:
  Flip a right-recursive stack-abusing rule into a left-recursive
  stack-friendly rule and clear up three messes in one shot: No more
  conflicts, no need for the GLR parser, and friendlier stackness.
 
  Ouch.  I'm feeling a bit stupid now,
 
  Absolutely no need for that.
 
 If you haven't had exp := aexp | exp aexp beaten into you with
 a big stick, maybe you should be happy about that ('s got a nail
 in it!) :-)
 
  And even without glr-parser, I'm still uncomfortable with the
  lexer-parser execution ordering issues with the current
  /dts-version/ proposal.  It may now be true that the order is
  guaranteed to be correct, but it's still not exactly obvious.
 
 If you require /dts-version/ (and similar global dtc-control stmts)
 to be at the start of the file, can't you avoid this ordering problem
 by starting to parse the file with a simple (hand-written) parser
 (which would handle these statements) and only when you cannot parse
 any more switch to the normal parser (which won't handle them)?
 Or is this a stupid suggestion :-)

Aieee, the pain!  No, please let's keep all the grammar information in
one place.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread David Gibson
On Tue, Oct 23, 2007 at 09:49:09AM -0500, Jon Loeliger wrote:
 So, like, the other day Segher Boessenkool mumbled:
  
   And even without glr-parser, I'm still uncomfortable with the
   lexer-parser execution ordering issues with the current
   /dts-version/ proposal.  It may now be true that the order is
   guaranteed to be correct, but it's still not exactly obvious.
  
  If you require /dts-version/ (and similar global dtc-control stmts)
  to be at the start of the file, can't you avoid this ordering problem
  by starting to parse the file with a simple (hand-written) parser
  (which would handle these statements) and only when you cannot parse
  any more switch to the normal parser (which won't handle them)?
  Or is this a stupid suggestion :-)
 
 My concern here is that I think we are attempting to make
 this significantly more complex than it needs to be.
 We have a pretty good KISS opportunity here, and I've shown
 it to be working so far.  Longer term, all this cruft will
 go away and it will become a moot point anyway.

Well, so am I.  And I think my alternative suggestion (use
'/dts-version-1/') is even simpler: it's recognized in the lexer, so
it's trivial and clear how it can affect both lexical processing and
parsing.  Using a different token here means the version can affect
the whole structure of the grammar, without hard to follow
interactions between global variables and parsing behavior (for
instance we can fairly easily grammatically ban the /memreserve/ a-b
form in v1 files).

And I don't see *any* advantage to parsing the version as an integer
rather than a token.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread David Gibson
On Tue, Oct 23, 2007 at 11:07:39AM -0500, Jon Loeliger wrote:
 So, like, the other day Jon Loeliger mumbled:
  
   First, a trivial one: I remember leaving this as a right-recursion,
   despite the stack-nastiness, because that way the properties end up in
   the same order as in the source.  I think that behaviour is worth
   preserving, but of course we can do it with left-recursion by changing
   chain_property() to add to the end of the list instead of the
   beginning.
  
  Understood.  And I wrestled with that as well.  In fact, I even
  wrote the reverse_properties() function, which I will include,
  and used it initially.  However, several test failed.  So I
  removed it, and it all started happily working again.
 
 I was confused.  It was the version of the code that _did_
 use the property reversal that worked.

Ok.  Except that I think we shouldn't need to have an explicit
reverse_properties().  Just build the list in-order by having
chain_property() append to the end of the list.

It's theoretically expensive, since we have to walk the list each
time, but frankly I don't think we need to worry about dtc performance
until we actually see a single non-contrived tree that takes a
noticeable amount of time to process:  I've never seen one yet.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-23 Thread David Gibson
On Tue, Oct 23, 2007 at 09:24:52AM -0500, Jon Loeliger wrote:
 So, like, the other day David Gibson mumbled:
  On Mon, Oct 22, 2007 at 04:13:54PM -0500, Jon Loeliger wrote:
[snip]
  I really thought our conflicts
  were somewhere else.  Specifically I thought the problem was that we
  needed to look ahead more tokens that we were able to differentiate
  between property and subnode definitions, i.e. between:
  label propname =
  and
  label propname {
 
 Yes, it was.  When you compute the closure of the propdef with
 the rule as a right-recursive, that's when you get the conflict.

Ah, right.  Been too long since studied LR parsers, obviously.

[snip]
  More significantly, I don't know that we want to burn our bridges with
  glr-parser.  glr-parser is a beautiful algorithm which means we can
  use essentially whatever form of grammar is the easiest to work with
  without having to fiddle about to ensure it's LALR(1).  This could
  still be useful if we encounter some less easily finessable grammar
  construct in future.
 
 I'm not saying we can't use it in the future, as needed!  I'm just
 saying it isn't strictly necessary now.

Sure.  Sorry, I was unclear; I wasn't saying we shouldn't remove the
glr-parser option now: certainly we should remove that
right-recursion, and then we don't need it.  I'm saying we shouldn't
do other things which would preclude bringing it back.

  And even without glr-parser, I'm still uncomfortable with the
  lexer-parser execution ordering issues with the current
  /dts-version/ proposal.  It may now be true that the order is
  guaranteed to be correct, but it's still not exactly obvious.
 
 I'm fine with it, and though I read your words, I'm not really
 sure why you are not  In the long term, maybe think of it
 as a temporary hack then.  We'll convert the existing DTS files
 over to the new version, and then deprecate the dts-version 0
 files and support and it will all go away relatively soon.

I just think that putting the version number into the token is
simpler, clearer and has no disadvantage compared to a version
integer.  And will remain so if we have to bump the version again.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] DTC: Remove the need for the GLR Parser.

2007-10-22 Thread David Gibson
On Mon, Oct 22, 2007 at 04:13:54PM -0500, Jon Loeliger wrote:
 Previously, there were a few shift/reduce and reduce/reduce
 errors in the grammar that were being handled by the not-so-popular
 GLR Parser technique.

I haven't actually heard anyone whinge about glr-parser...

 Flip a right-recursive stack-abusing rule into a left-recursive
 stack-friendly rule and clear up three messes in one shot: No more
 conflicts, no need for the GLR parser, and friendlier stackness.

Ouch.  I'm feeling a bit stupid now, I really thought our conflicts
were somewhere else.  Specifically I thought the problem was that we
needed to look ahead more tokens that we were able to differentiate
between property and subnode definitions, i.e. between:
label propname =
and
label propname {

Except... I'm almost certain the conflicts first appeared when I added
labels, and I can't see how that would affect this.  Well, colour me
baffled.

Especially since the comments and content of commit
4102d840d993e7cce7d5c5aea8ef696dc81236fc (second commit in the entire
history!) appear to back up my memory of this.  I used to have a
lookahead hack in the lexer to remove the conflict.

But this patch certainly seems to make the conflicts go away, so I'm
confused.


Well, regardless of that, I have a few concerns.

First, a trivial one: I remember leaving this as a right-recursion,
despite the stack-nastiness, because that way the properties end up in
the same order as in the source.  I think that behaviour is worth
preserving, but of course we can do it with left-recursion by changing
chain_property() to add to the end of the list instead of the
beginning.  Also, if we're going to avoid right-recursion here, we
should do so for the 'subnodes' productions as well, which is
completely analogous.

More significantly, I don't know that we want to burn our bridges with
glr-parser.  glr-parser is a beautiful algorithm which means we can
use essentially whatever form of grammar is the easiest to work with
without having to fiddle about to ensure it's LALR(1).  This could
still be useful if we encounter some less easily finessable grammar
construct in future.

And even without glr-parser, I'm still uncomfortable with the
lexer-parser execution ordering issues with the current
/dts-version/ proposal.  It may now be true that the order is
guaranteed to be correct, but it's still not exactly obvious.

It seems to me that the version change introduces a lexical change to
the input format, and should therefore be handled at the lexical
level.  And I think there are other potential advantages to parsing
the version identifier as a token, rather than as an integer (such as
being able to define entirely different grammars for different
versions, if we have to).

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev