Re: [Emc-developers] [Emc-users] question on gcode parsing

Michael Haberler Thu, 26 Jan 2012 14:37:54 -0800

Am 25.01.2012 um 22:27 schrieb Kenneth Lerman:

> On 1/25/2012 3:22 PM, Michael Haberler wrote:
>> [this should move to emc-developers, which is why I'm cc'ing there]
>> 
>> it just occurred to me that a decent parser would give us the opportunity for 
>> a significant language simplification while retaining backwards 
>> compatibility.
>> 
>> An example for the current RS274NGC language with variable references, 
>> expressions and control structures:
>> 
>> ----------------------
>> #<var1> = [#<foo> + 1]
>> #<var2> = 10
>> 
>> o#<label1> if [#<var1> lt #<var2>]
>>     ....
>> o#<label1> else
>>     ....
>> o#<label1> endif
>> ----------------------
>> 
>> Note the pathetic amount of syntactic noise - wouldn't it be more readable to 
>> write:
>> 
>> ----------------------
>> $var1 = $foo + 1
>> $var2 = 10
>> 
>> if $var1 < $var2
>>     ...
>> else
>>     ...
>> endif
>> ----------------------
>> 
>> We have several noise chars per variable (#<>), useless labels including 
>> noise (o#<label1>) which do not help in disambiguating, and useless brackets 
>> around expressions, plus, well, fortranesque operators
>> 
>> Now the major reason why this is so is that the current scanner only does 
>> lookahead 1 character, and the parser is inadequate; if even Perl can do 
>> it, so should RS274NGC.
>> 
>> A combination of, say, a flex scanner and a bison parser should be able to 
>> parse both examples unambiguously. Moreover, the bison run itself should 
>> tell whether there are any ambiguities or conflicts when such a language 
>> simplification is introduced - it would report a shift/reduce or 
>> reduce/reduce conflict. For instance, one could experiment whether the '$' 
>> as variable introducer is actually necessary (it probably is due to 
>> ambiguities with words in a block).
>> 
>> I understand this is quite different from your pretty printer/lint goal
>> 
>> If we were to go about this, I think the way to do this is:
>> 
>> - have both parsers as alternatives
>> - add a flag to sai/rs274 to parse a file with old and new parser
>> - compare outputs for regression tests
>> - when it is clear that no ambiguities are left, move it to mainline as the 
>> default parser
>> 
>> - Michael


Hi Ken,

I'll write up my plans and the status on a wiki page, so I'll comment just 
minimally here.

The whole idea came out of a 'lint'-type tool, which would have started as a 
separate tool. That is a valid idea, but I don't think maintaining two parsers 
for the same language is a great idea longer term - they are bound to diverge - 
so I thought about integrating. I didn't suggest deprecating language elements 
(see the first sentence of my message), which is why I proposed the 
backwards-compatibility check at the end of it.

An upside of a parser generator approach would be that major syntactic issues 
are detected early - at parser generation time. I agree that's not a panacea - 
adding good error recovery productions in a yacc-type parser is kind of a black 
art, and it makes all the difference between a usable parser and a 'Compilers 
101' type effort. Old versions of yacc needed some extra massaging to produce 
the set of expected symbols on error, for good diagnostics. WRT error messages, 
I assume you refer to parser generator error messages, since the syntax error 
messages would be user-defined. Unfortunately, a parser generator doesn't help 
with lexical ambiguities, like the one due to squashing white space; it does 
help significantly to use f/lex scanner start states derived from parser state.
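
To make the whitespace ambiguity concrete, here is a minimal Python sketch. It assumes the reader step simply lowercases a line and strips all whitespace before tokenizing; comments and other special cases are ignored, and the function name `squash` is made up for illustration.

```python
def squash(line: str) -> str:
    """Simplified stand-in for the whitespace squashing done before
    tokenizing: drop all whitespace, fold to lower case."""
    return "".join(line.split()).lower()


# Three lines a human reads differently collapse into one character
# stream, so a grammar working on the squashed text cannot tell whether
# the programmer meant one name (AYB) or two (A and B).
variants = ["X A Y B", "XA YB", "XAYB"]
squashed = [squash(v) for v in variants]
print(squashed)  # ['xayb', 'xayb', 'xayb']
```

Any decision about where a name ends therefore has to be made by the scanner, for example via explicit delimiters or start states, and not by the grammar.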

re source location/line numbers: 
I agree - since line numbers aren't unique anymore in the presence of several 
ngc files, this breaks RFL (run from line), see for instance bug #3440704, 
which cannot be resolved given current runtime support. The current state of 
line number feedback in UIs has lots of room for improvement. I have started 
on source location (which would be a unique file id/line number pair) in my 
next iteration of breakage attempts.
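
A minimal sketch of such a source location in Python, assuming a per-run table that maps a small integer file id to a file name. The names `SourceLocation`, `files` and `describe`, and the file names, are hypothetical and not taken from the existing code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceLocation:
    file_id: int  # hypothetical index into the table of loaded ngc files
    line: int     # 1-based line number within that file


# Hypothetical file table: a main program and a subroutine file.
files = {0: "main.ngc", 1: "probe.ngc"}


def describe(loc: SourceLocation) -> str:
    """Render a location the way a UI might show it."""
    return f"{files[loc.file_id]}:{loc.line}"


in_main = SourceLocation(file_id=0, line=5)
in_sub = SourceLocation(file_id=1, line=5)

# The bare line numbers collide, the (file id, line) pairs do not.
print(in_main.line == in_sub.line)  # True
print(in_main == in_sub)            # False
print(describe(in_sub))             # probe.ngc:5
```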

The call stack is a nice idea and not that hard to do - noted.

re MDI:

That's a sore point and a constant source of trouble, and another example where 
the original control structure outlived its usefulness - without control 
structures, having task call upon interp to read and execute blocks in turn 
looked viable. With control structures, trouble came - the indicators are 
several state variables in interpreter and task, and a 'side queue' in task. 
Which is odd, because the interpreter really has no business understanding 
anything about 'MDI mode'.

The current approach wrt MDI is a bit like this: you have a parser, you feed it 
a line at a time, and then try to figure out somehow what the poor thing's state 
is. Also, the idea that task tries to 'infer' state by observing the 
interpreter (waiting, reading etc) is a bit odd - it could tell by itself, 
without guesswork.

My approach would be to drop the 'interpreter as a subroutine' concept and make 
it a thread - it would be fed strings, or files, and it would tell by 
callbacks when something interesting happens or is needed, e.g. a sync(), step 
complete, a state update, program end, error, and the like. With this approach 
you could even type control structures at the MDI level, line by line, and have 
them execute correctly - the interpreter doesn't know the difference between 
MDI and run, and need not know it. And it could be quite helpful at the 
interactive level: the interpreter could tell a) that some more input is needed 
and even b) what symbols are permitted.
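
A toy Python sketch of the 'interpreter as a thread' idea follows. It is not the design of the existing interpreter: the class `ToyInterpreter`, the event names and the two-keyword 'grammar' are all assumptions made for illustration. The thread is fed lines through a queue and reports through a single callback whether a step is complete or more input is needed.

```python
import queue
import threading

BLOCK_OPENERS = {"if", "while"}        # toy subset, not the real grammar
BLOCK_CLOSERS = {"endif", "endwhile"}


class ToyInterpreter(threading.Thread):
    def __init__(self, on_event):
        super().__init__(daemon=True)
        self._lines = queue.Queue()
        self._on_event = on_event      # callback(kind, data)

    def feed(self, line):
        """Hand one line to the interpreter (MDI and file look the same)."""
        self._lines.put(line)

    def close(self):
        """Signal end of input."""
        self._lines.put(None)

    def run(self):
        pending, depth = [], 0         # lines and nesting of an open block
        while True:
            line = self._lines.get()
            if line is None:
                self._on_event("program_end", None)
                return
            word = line.split()[0].lower() if line.strip() else ""
            if word in BLOCK_OPENERS:
                depth += 1
            elif word in BLOCK_CLOSERS:
                depth -= 1
            if depth < 0:              # closer without a matching opener
                self._on_event("error", line)
                pending, depth = [], 0
                continue
            pending.append(line)
            if depth > 0:
                self._on_event("need_more_input", depth)
            else:
                self._on_event("step_complete", list(pending))
                pending.clear()


events = []
interp = ToyInterpreter(lambda kind, data: events.append((kind, data)))
interp.start()
for mdi_line in ["G0 X1", "if $a < 1", "G1 X2", "endif"]:
    interp.feed(mdi_line)
interp.close()
interp.join()
print([kind for kind, _ in events])
```

The caller never inspects interpreter state: it learns from the `need_more_input` events that the `if` block is still open, and receives the whole block as one `step_complete` once `endif` arrives.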

re evolving the language:

I see the upside, but I'm undecided about the 'remove squashing whitespace' 
idea. Anybody going about it had better have some consensus behind him before 
starting, or face frustration at merge time, but that belongs in a different 
thread.

If one were to go about it, again a 'dual parser standalone interpreter' 
would help in verifying backwards compatibility and pointing out issues.
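
A minimal sketch of such a comparison harness in Python, assuming each parser can be wrapped as a function that maps program text to a list of canonical command strings (roughly what the standalone interpreter prints). Both `old_parse` and `new_parse` below are hypothetical stand-ins, and `new_parse` is deliberately broken to show a reported divergence.

```python
import difflib


def compare_parsers(source, old_parse, new_parse):
    """Run both parsers on the same program and return a unified diff
    of their outputs; an empty list means they agree."""
    old_out = old_parse(source)
    new_out = new_parse(source)
    return list(difflib.unified_diff(old_out, new_out, "old", "new",
                                     lineterm=""))


def old_parse(source):
    # Stand-in for the existing parser: one normalized command per line.
    return [" ".join(line.split()).upper()
            for line in source.splitlines() if line.strip()]


def new_parse(source):
    # Stand-in for the new parser, with a deliberate bug: it drops M words.
    return [cmd for cmd in old_parse(source) if not cmd.startswith("M")]


program = "g0 x1\nm3 s1000\ng1 x2 f100\n"

print(compare_parsers(program, old_parse, old_parse))  # []
for diff_line in compare_parsers(program, old_parse, new_parse):
    print(diff_line)
```

Run over the existing test programs, an empty diff for every file would be the evidence that the new parser is backwards compatible.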

re jog forwards/backwards:
I'd be interested to hear your ideas in more detail. It comes up regularly in 
different shapes or forms, and it occurs to me that it would need motion 
support. How can the interpreter help here? And what kind of context would GUIs 
have to carry? Motion information?

regards

- Michael

> Hi Michael,
> 
> The present language evolved (although some might say "devolved") from 
> RS274NGC. As it was developed in an incremental manner (mostly by me), I 
> had several constraints. The only way I could reasonably expect to have 
> my changes accepted was to require that every existing program be a 
> legal program in the new language that did exactly the same thing.
> 
> I agree that we have a large amount of lexical noise in the language. In 
> particular, I took a lazy approach to matching 'if', 'else', and 'endif' 
> (and also loop beginnings and endings). I needed to have a method of 
> providing labels, anyway, because subroutines needed names.
> 
> When I added named variables, I needed some way of determining the end 
> of a variable name. Because of the nasty "feature" of the language that 
> removes whitespace, we chose to bracket the name, instead. We could, of 
> course, use '#anyOldName>' or '#anyOldName$', but at the time it seemed 
> that the extra '<' at the beginning would leave the code looking a 
> little better.
> 
> I knew at the time I made these changes that they were "less than 
> optimal" (although since it was my choice, it would probably be OK for 
> me to say "crappy"). To my mind, those choices were a better alternative 
> to doing nothing. I suspect that the users of o-words and named 
> variables would agree. After all, if you don't like them, you don't have 
> to use them.
> 
> To me, a more significant issue than the language is the set of issues 
> concerning such things as calling subroutines from MDI, and stopping and 
> continuing from a specified line. At the time I wrote the code, I 
> recognized that these were issues, but again, I felt it was better than 
> nothing. (Again, if you don't like the way a new feature works, don't 
> use it.)
> 
> Enough about history, though. Addressing the future of the language (and 
> its interpreter):
> 
> 1 -- I still believe that every valid RS274NGC program when run with the 
> RS274NGC.new interpreter should do exactly the same thing. I would keep 
> the fortranesque (to use Michael's term) operators.
> 
> 2 -- I believe that we can (and should) eliminate the requirement to use 
> o-words where they are used solely to match control structure 
> components. It should not be an error to include them. We might want to 
> permit subroutines to be declared with "sub name" instead of "o<name> 
> sub". We should eliminate the removal of whitespace and use whitespace 
> as a delimiter. Then "X123" would be the same as "X 123", but "XA YB" 
> would mean "X#<A> Y#<B>" while "XAYB" would mean "X#<AYB>".  [clearly 
> this idea needs a lot more thought]
> 
> 3 -- My experience with lex (or flex) and bison (or yacc) is somewhat 
> mixed. I believe that the nature of bison makes error handling difficult 
> and error messages vague and awkward. Personally, I prefer to 
> hand-generate recursive descent parsers. I haven't played with antlr and 
> other modern tools enough to know how they would fare.
> 
> 4 -- We've come a long way since the original EMC code was written. In 
> particular, we have lots more memory. I believe that the interface 
> between the interpreter and the executor should convey a lot more data. 
> Among the goals I would suggest is the ability to jog along the 
> execution path in a forwards or backwards direction. I would suggest 
> that doing this properly might also require Axis (and other GUIs) to 
> carry a lot more context. For a user to know what is happening, a copy 
> of the execution stack should be displayable. After all, a line number 
> in a subroutine doesn't tell you where you are if you don't know where 
> you were called from.
> ======
> 
> In the past, I've put my suggested changes up on the Wiki. I found that 
> to be a decent forum for keeping a history of a discussion. I suggest 
> that we try to do something like that in the future.
> 
> Regards,
> 
> Ken
>> 
>> 
>> ------------------------------------------------------------------------------
>> Keep Your Developer Skills Current with LearnDevNow!
>> The most comprehensive online learning library for Microsoft developers
>> is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
>> Metro Style Apps, more. Free future releases when you subscribe now!
>> http://p.sf.net/sfu/learndevnow-d2d
>> _______________________________________________
>> Emc-developers mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/emc-developers
> 


