Re: [PATCH] DTC: Remove the need for the GLR Parser.
So, like, the other day David Gibson mumbled: On Mon, Oct 22, 2007 at 04:13:54PM -0500, Jon Loeliger wrote: Previously, there were a few shift/reduce and reduce/reduce errors in the grammar that were being handled by the not-so-popular GLR Parser technique. I haven't actually heard anyone whinge about glr-parser... I have. :-) Flip a right-recursive stack-abusing rule into a left-recursive stack-friendly rule and clear up three messes in one shot: No more conflicts, no need for the GLR parser, and friendlier stackness. Ouch. I'm feeling a bit stupid now, Absolutely no need for that. I really thought our conflicts were somewhere else. Specifically I thought the problem was that we needed to look ahead more tokens that we were able to differentiate between property and subnode definitions, i.e. between: label propname = and label propname { Yes, it was. When you compute the closure of the propdef with the rule as a right-recursive, that's when you get the conflict. Well, regardless of that, I have a few concerns. First, a trivial one: I remember leaving this as a right-recursion, despite the stack-nastiness, because that way the properties end up in the same order as in the source. I think that behaviour is worth preserving, but of course we can do it with left-recursion by changing chain_property() to add to the end of the list instead of the beginning. Understood. And I wrestled with that as well. In fact, I even wrote the reverse_properties() function, which I will include, and used it initially. However, several test failed. So I removed it, and it all started happily working again. Also, if we're going to avoid right-recursion here, we should do so for the 'subnodes' productions as well, which is completely analogous. Well, isn't _that_ an interesting observation... :-) I'll add it to the grand DTC To Do list. More significantly, I don't know that we want to burn our bridges with glr-parser. glr-parser is a beautiful algorithm which means we can use essentially whatever form of grammar is the easiest to work with without having to fiddle about to ensure it's LALR(1). This could still be useful if we encounter some less easily finessable grammar construct in future. I'm not saying we can't use it in the future, as needed! I'm just saying it isn't strictly necessary now. And even without glr-parser, I'm still uncomfortable with the lexer-parser execution ordering issues with the current /dts-version/ proposal. It may now be true that the order is guaranteed to be correct, but it's still not exactly obvious. I'm fine with it, and though I read your words, I'm not really sure why you are not In the long term, maybe think of it as a temporary hack then. We'll convert the existing DTS files over to the new version, and then deprecate the dts-version 0 files and support and it will all go away relatively soon. It seems to me that the version change introduces a lexical change to the input format, and should therefore be handled at the lexical level. And I think there are other potential advantages to parsing the version identifier as a token, rather than as an integer (such as being able to define entirely different grammars for different versions, if we have to). Version symbol or number, that's still not precluded, I don't think. jdl ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] DTC: Remove the need for the GLR Parser.
So, like, the other day Jon Loeliger mumbled: First, a trivial one: I remember leaving this as a right-recursion, despite the stack-nastiness, because that way the properties end up in the same order as in the source. I think that behaviour is worth preserving, but of course we can do it with left-recursion by changing chain_property() to add to the end of the list instead of the beginning. Understood. And I wrestled with that as well. In fact, I even wrote the reverse_properties() function, which I will include, and used it initially. However, several test failed. So I removed it, and it all started happily working again. I was confused. It was the version of the code that _did_ use the property reversal that worked. Even Milton's asm test with labels worked. :-) jdl ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] DTC: Remove the need for the GLR Parser.
On Tue, Oct 23, 2007 at 04:41:51PM +0200, Segher Boessenkool wrote: Flip a right-recursive stack-abusing rule into a left-recursive stack-friendly rule and clear up three messes in one shot: No more conflicts, no need for the GLR parser, and friendlier stackness. Ouch. I'm feeling a bit stupid now, Absolutely no need for that. If you haven't had exp := aexp | exp aexp beaten into you with a big stick, maybe you should be happy about that ('s got a nail in it!) :-) And even without glr-parser, I'm still uncomfortable with the lexer-parser execution ordering issues with the current /dts-version/ proposal. It may now be true that the order is guaranteed to be correct, but it's still not exactly obvious. If you require /dts-version/ (and similar global dtc-control stmts) to be at the start of the file, can't you avoid this ordering problem by starting to parse the file with a simple (hand-written) parser (which would handle these statements) and only when you cannot parse any more switch to the normal parser (which won't handle them)? Or is this a stupid suggestion :-) Aieee, the pain! No, please let's keep all the grammar information in one place. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] DTC: Remove the need for the GLR Parser.
On Tue, Oct 23, 2007 at 09:49:09AM -0500, Jon Loeliger wrote: So, like, the other day Segher Boessenkool mumbled: And even without glr-parser, I'm still uncomfortable with the lexer-parser execution ordering issues with the current /dts-version/ proposal. It may now be true that the order is guaranteed to be correct, but it's still not exactly obvious. If you require /dts-version/ (and similar global dtc-control stmts) to be at the start of the file, can't you avoid this ordering problem by starting to parse the file with a simple (hand-written) parser (which would handle these statements) and only when you cannot parse any more switch to the normal parser (which won't handle them)? Or is this a stupid suggestion :-) My concern here is that I think we are attempting to make this significantly more complex than it needs to be. We have a pretty good KISS opportunity here, and I've shown it to be working so far. Longer term, all this cruft will go away and it will become a moot point anyway. Well, so am I. And I think my alternative suggestion (use '/dts-version-1/') is even simpler: it's recognized in the lexer, so it's trivial and clear how it can affect both lexical processing and parsing. Using a different token here means the version can affect the whole structure of the grammar, without hard to follow interactions between global variables and parsing behavior (for instance we can fairly easily grammatically ban the /memreserve/ a-b form in v1 files). And I don't see *any* advantage to parsing the version as an integer rather than a token. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] DTC: Remove the need for the GLR Parser.
On Tue, Oct 23, 2007 at 11:07:39AM -0500, Jon Loeliger wrote: So, like, the other day Jon Loeliger mumbled: First, a trivial one: I remember leaving this as a right-recursion, despite the stack-nastiness, because that way the properties end up in the same order as in the source. I think that behaviour is worth preserving, but of course we can do it with left-recursion by changing chain_property() to add to the end of the list instead of the beginning. Understood. And I wrestled with that as well. In fact, I even wrote the reverse_properties() function, which I will include, and used it initially. However, several test failed. So I removed it, and it all started happily working again. I was confused. It was the version of the code that _did_ use the property reversal that worked. Ok. Except that I think we shouldn't need to have an explicit reverse_properties(). Just build the list in-order by having chain_property() append to the end of the list. It's theoretically expensive, since we have to walk the list each time, but frankly I don't think we need to worry about dtc performance until we actually see a single non-contrived tree that takes a noticeable amount of time to process: I've never seen one yet. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] DTC: Remove the need for the GLR Parser.
On Tue, Oct 23, 2007 at 09:24:52AM -0500, Jon Loeliger wrote: So, like, the other day David Gibson mumbled: On Mon, Oct 22, 2007 at 04:13:54PM -0500, Jon Loeliger wrote: [snip] I really thought our conflicts were somewhere else. Specifically I thought the problem was that we needed to look ahead more tokens that we were able to differentiate between property and subnode definitions, i.e. between: label propname = and label propname { Yes, it was. When you compute the closure of the propdef with the rule as a right-recursive, that's when you get the conflict. Ah, right. Been too long since studied LR parsers, obviously. [snip] More significantly, I don't know that we want to burn our bridges with glr-parser. glr-parser is a beautiful algorithm which means we can use essentially whatever form of grammar is the easiest to work with without having to fiddle about to ensure it's LALR(1). This could still be useful if we encounter some less easily finessable grammar construct in future. I'm not saying we can't use it in the future, as needed! I'm just saying it isn't strictly necessary now. Sure. Sorry, I was unclear; I wasn't saying we shouldn't remove the glr-parser option now: certainly we should remove that right-recursion, and then we don't need it. I'm saying we shouldn't do other things which would preclude bringing it back. And even without glr-parser, I'm still uncomfortable with the lexer-parser execution ordering issues with the current /dts-version/ proposal. It may now be true that the order is guaranteed to be correct, but it's still not exactly obvious. I'm fine with it, and though I read your words, I'm not really sure why you are not In the long term, maybe think of it as a temporary hack then. We'll convert the existing DTS files over to the new version, and then deprecate the dts-version 0 files and support and it will all go away relatively soon. I just think that putting the version number into the token is simpler, clearer and has no disadvantage compared to a version integer. And will remain so if we have to bump the version again. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] DTC: Remove the need for the GLR Parser.
On Mon, Oct 22, 2007 at 04:13:54PM -0500, Jon Loeliger wrote: Previously, there were a few shift/reduce and reduce/reduce errors in the grammar that were being handled by the not-so-popular GLR Parser technique. I haven't actually heard anyone whinge about glr-parser... Flip a right-recursive stack-abusing rule into a left-recursive stack-friendly rule and clear up three messes in one shot: No more conflicts, no need for the GLR parser, and friendlier stackness. Ouch. I'm feeling a bit stupid now, I really thought our conflicts were somewhere else. Specifically I thought the problem was that we needed to look ahead more tokens that we were able to differentiate between property and subnode definitions, i.e. between: label propname = and label propname { Except... I'm almost certain the conflicts first appeared when I added labels, and I can't see how that would affect this. Well, colour me baffled. Especially since the comments and content of commit 4102d840d993e7cce7d5c5aea8ef696dc81236fc (second commit in the entire history!) appear to back up my memory of this. I used to have a lookahead hack in the lexer to remove the conflict. But this patch certainly seems to make the conflicts go away, so I'm confused. Well, regardless of that, I have a few concerns. First, a trivial one: I remember leaving this as a right-recursion, despite the stack-nastiness, because that way the properties end up in the same order as in the source. I think that behaviour is worth preserving, but of course we can do it with left-recursion by changing chain_property() to add to the end of the list instead of the beginning. Also, if we're going to avoid right-recursion here, we should do so for the 'subnodes' productions as well, which is completely analogous. More significantly, I don't know that we want to burn our bridges with glr-parser. glr-parser is a beautiful algorithm which means we can use essentially whatever form of grammar is the easiest to work with without having to fiddle about to ensure it's LALR(1). This could still be useful if we encounter some less easily finessable grammar construct in future. And even without glr-parser, I'm still uncomfortable with the lexer-parser execution ordering issues with the current /dts-version/ proposal. It may now be true that the order is guaranteed to be correct, but it's still not exactly obvious. It seems to me that the version change introduces a lexical change to the input format, and should therefore be handled at the lexical level. And I think there are other potential advantages to parsing the version identifier as a token, rather than as an integer (such as being able to define entirely different grammars for different versions, if we have to). -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev