Re: [Emc-users] question on gcode parsing

Erik Christiansen Sun, 29 Jan 2012 04:31:45 -0800

On 29.01.12 09:31, Michael Haberler wrote:
> Am 29.01.2012 um 06:37 schrieb Erik Christiansen:
> >> ----------------------
> >> $var1 = $foo + 1
> >> $var2 = 10
> >> 
> >> if $var1 < $var2
> >>    ...
> >> else
> >>        ...
> >> endif
> >> ----------------------
> > 
> > Indeedy, but even the '$' is unnecessary.
> 
> I'm not sure whether that's possible in all cases.
> 
> There are several syntactic constructs - blocks, expressions,
> assignments and control structures.
>
> The noise in expressions really is only needed within blocks; in
> assignments, and expressions within control structure probably can be
> completely 'de-noised' and probably 'de-dollared' and partially
> 'de-hashed' too;).

To 'de-hash' our cleaner gcode, an alternative means of identifying
numbered parameters would be required. Otherwise they'd be
indistinguishable from simple integers.

My personal preference would be to use the '$' freed up from variable
names to identify numbered parameters, freeing '#' for the comment
delimiter, thus completely disambiguating func (foo + $100) # Comment.
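
A minimal sketch, in Python, of how a lexer could tell these apart. It
assumes the proposal above: '$' introduces a numbered parameter and '#'
starts a comment running to end of line. The token names and the
tokenize() function are invented for illustration; none of this is
existing LinuxCNC code.

```python
import re

# Hypothetical token set for the proposed syntax: '$' introduces a numbered
# parameter, '#' starts a comment that runs to the end of the line.
TOKEN_SPEC = [
    ("COMMENT", r"#.*"),
    ("PARAM",   r"\$\d+"),
    ("NUMBER",  r"\d+\.\d*|\.\d+|\d+"),
    ("NAME",    r"[A-Za-z_][A-Za-z_0-9]*"),
    ("OP",      r"[-+*/()=<>\[\]]"),
    ("SKIP",    r"[ \t]+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def tokenize(line):
    """Return (kind, text) pairs, dropping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(line):
        m = TOKEN_RE.match(line, pos)
        if m is None:
            raise ValueError(f"unexpected character {line[pos]!r} at column {pos}")
        if m.lastgroup not in ("SKIP", "COMMENT"):
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With these rules, $100 lexes as a parameter reference and a bare 100 as
a plain number, so neither can be mistaken for the other or for a
comment.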

> as Ken laid out, the tough decision is disambiguating within blocks by
> re-introducing whitespace as delimiter; or stay with using brackets to
> delimit expressions.

Perhaps I'm failing to imagine sufficiently pathological syntax cases,
but the only inescapable need for whitespace floating before my eyes is
two consecutive names, and I don't think that occurs in gcode. There
will always be either a parenthesis or operator in between, I suspect.

> I think it's possible to:
> - make the bracketing requirement for expressions and assignments in
> control structure tests optional; 
> - drop the #<> requirement for named variables references and
> assignments in above; an introducer for numbered params is still
> needed
> - make the bracketing requirement *within* expressions optional. 
> - introduce more natural aliases for the EQ/NE/LT etc operators 

+1

> I'm unsure whether the () comment syntax can be disambiguated from
> normal function parameter lists like atan(param); this might need a
> backtracking parser, or we stay with brackets, and there isn't much to
> gain.

I know of no other language which has a similar comment kludge.
We can also move up to the standard of having an unambiguous comment
delimiter. One way is described above. It has the merit of increased
consistency with scripting languages encountered by LinuxCNC
integrators, reducing language whiplash when we go from one to another.

> This is untested conjecture, but I think one could come up with a
> grammar which would still parse the current language and be able to
> write:
>
> sub func
> ;blablah
> endsub
> 
> baz = 0.45
> foo = atan[bar * 47.11] ; possible
> foo = atan(bar * 47.11) ; unsure - guess not; not much gain either

Language insularity does have an ongoing cost. I'd be interested to hear
whether the broad user base thinks the last line above looks more like a
function, than the preceding one.

Running with parallel distinct parsers, and a switch word in the input,
has the major advantage that there will be no regression in the old
parser due to the new, and grammar conflicts which might arise in a
combined parser are effortlessly avoided. Trying to be both Arthur and
Martha, at the one time, is usually much more difficult than settling
for alternation.

> #43 = foo / 10

While that could be 'de-hashed' without an alternative numbered
parameter identifier, I don't see how you'd propose to handle:

#43 = foo / #44

> if foo > bar
>    g10 x[foo * baz]
> else
>    g0 y[baz] z#43 x[func[10]] ; likely possible
>    g0 y[baz] z#43 x[func(10)] ; I guess not
> endif
> 
> that's less filling, but in effect requires bracketing expressions in
> blocks, without requiring it elsewhere, which is a bit confusing.

With the minor grammar change outlined above, we could have:

g0 y[baz] z$43 x[func(10)]   # I guess so.

To declutter further, this is readily parsable with a simple grammar:

g0 y baz; z $43; x func(10);   # Jeez, readable or wot?

That would not necessitate a ' ' delimiter, and it would be just as
parsable if written:

g0 ybaz; z$43; xfunc(10);   # Not as readable, is it?

I might need a stateful lexer to do that easily, but they're simple
enough in lex. (So I often use them straight off.)

Oh, even the latter form would allow names which start with an axis
letter, e.g.:

g0 yyaz; z$43; xzunc(10);   # Not as readable, is it?
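
A rough sketch of that idea in Python, standing in for the lex start
conditions I have in mind. It assumes the ';'-terminated form shown
above and '#' as the comment delimiter; the split_block() function and
the axis letter set are illustrative only. All whitespace is squashed
before scanning, to show that no space delimiter is needed: once the
command word is consumed, the first character of each word is taken as
the axis letter and the rest as the expression.

```python
import re

AXIS_LETTERS = "xyzabcuvw"                      # assumed set of axis letters
COMMAND_RE = re.compile(r"[gm]\d+(?:\.\d+)?")   # e.g. g0, g10, m3

def split_block(block):
    """Split one block into its command word and (axis, expression) pairs."""
    code = block.split("#", 1)[0]               # drop a trailing '#' comment
    code = "".join(code.split()).lower()        # squash whitespace, fold case
    m = COMMAND_RE.match(code)
    if m is None:
        raise ValueError("block does not start with a g or m word")
    words = []
    for piece in code[m.end():].split(";"):
        if not piece:
            continue                            # empty tail after the last ';'
        letter, expr = piece[0], piece[1:]
        if letter not in AXIS_LETTERS or not expr:
            raise ValueError(f"bad axis word {piece!r}")
        words.append((letter, expr))
    return m.group(), words
```

Because the axis letter is fixed by position, "yyaz" splits as axis y
with the name "yaz", and the spaced and unspaced forms give the same
result.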

> It's one possible route. The other is the 'don't squash whitespace'
> route. That is a big decision since the latter isn't backwards
> compatible.

Still haven't seen a case which needs space delimiters.
(While I like whitespace for readability, I don't favour dependence on
it in the grammar, because humans can't always find the space bar.
"Be demanding in your output, but tolerant of your input." is a good
mantra for both a coder and (to a lesser extent) a grammar designer.)
...

> > Or mandate a "gcode+" keyword on the first line of input, to
> > allow either type at run-time?
> 
> the latter would break backwards compatibility - that keyword and the
> rest of the file in 'old rs274ngc' syntax wouldn't parse, at least not
> now and with old versions of linuxcnc.

Not in practice. ;-)
Any 'old rs274ngc' file will not have the new keyword, to upset an old parser.
Any 'new gcode' file will have the new keyword, and so will invoke the
new parser. Voila! No problem exists.
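
The dispatch could be as simple as this Python sketch. The "gcode+"
keyword is the one suggested in the quoted text above; parse_old() and
parse_new() are placeholders, not real LinuxCNC entry points.

```python
SWITCH_WORD = "gcode+"   # keyword proposed earlier in this thread

def parse_old(lines):
    # Placeholder for the existing rs274ngc parser.
    return ("old", lines)

def parse_new(lines):
    # Placeholder for a parser of the decluttered syntax.
    return ("new", lines)

def parse_program(text):
    """Pick a parser from the first line; the two grammars never mix."""
    lines = text.splitlines()
    if lines and lines[0].strip().lower() == SWITCH_WORD:
        return parse_new(lines[1:])   # keyword consumed, rest is new syntax
    return parse_old(lines)           # no keyword: untouched old behaviour
```

A file without the keyword never reaches the new parser, so the old
grammar is never asked to cope with the new one.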

> > Manually, a custom regression testing framework, or move to DejaGnu?
> 
> IMO the next person to introduce another piece of TCL to linuxcnc
> should be damned to use Forth exclusively for the rest of her life.
> The primary benefit of TCL was to get John Ousterhout tenure at
> Berkeley, but then LSD and BSD came from there too, no coincidence IMO ;)

:-))

While I have brief experience with DejaGnu, on another OS project, I
don't prefer it to alternatives. In industry, I've tended to use simple
stimulus files (gcode snippets in our case) and "reference output" for
comparison with actual. Automated comparison is simple, and doesn't
require TCL. :-)
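
As a sketch of what I mean, in Python rather than TCL: feed a stimulus
snippet to the parser and diff what comes back against the stored
reference output. The function names are illustrative, and "parse"
stands for whatever callable turns gcode text into a printable result.

```python
import difflib

def compare_with_reference(actual, reference):
    """Return a unified diff as a list of lines; empty means a match."""
    return list(difflib.unified_diff(
        reference.splitlines(), actual.splitlines(),
        fromfile="reference", tofile="actual", lineterm=""))

def run_case(parse, stimulus, reference):
    """Run one stimulus through the parser and compare with the reference."""
    return compare_with_reference(parse(stimulus), reference)
```

An empty diff is a pass; anything else is the failure report, ready to
print.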

> That shouldn't prevent us from looking for a more comprehensive
> regression testing framework. Testing GUI applications 'end to end' is
> lacking sorely, btw.

I don't know where 'GUI applications' comes into the picture.
I thought we were talking about a gcode parser.

> > If we could move to a BNF specification of our permissible grammar, then
> > the problem would diminish, I think.
> 
> An example of a flex/bison parser for something which might eventually
> resemble rs274ngc, in c++, plumbed into the linuxcnc build system
> (Submakefile) is in
>
> http://git.mah.priv.at/gitweb/emc2-dev.git/shortlog/refs/heads/parser-v2-dev
>
> It's fairly useless and defunct right now, but if somebody wants to play
> based on an example which builds, it's a start.
> 
> The grammar in the fennic.net parser needs a lot of work, and the EBNF
> from the Tom Kramer paper too.

If the current discussion proceeds to a proposed specification of a new
syntax, and interest in a decluttered syntax is evident, then I'll take
an active interest, and begin to play. At least there is a little more
interest than on previous laps around this mulberry bush.

It's not hard to build up the grammar from scratch. It's just time
consuming to get it right.

OK folks, now is a good time to weigh in, either on the side of human
readable gcode, or in favour of the old cluttered kludge.

Erik

-- 
Programs must be written for people to read, and only incidentally for        
machines to execute.                            - Abelson and Sussman

