Not necessarily relevant, but marpa_v_step() can return a negative number on failure, and I think this is not handled. In particular, there are a number of unexpected return values from marpa_v_step() that you currently handle via fall-through, and these are probably better treated as fatal errors.
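
To make that concrete, here is a rough sketch of the loop shape I have in mind. It is not taken from your code and it does not call libmarpa: `next_step()`, `Stepper`, and the `STEP_*` names are hypothetical stand-ins for marpa_v_step(), the valuator, and the MARPA_STEP_* constants, chosen only so the fragment compiles on its own. The point is just that a negative return and an unrecognized step type both stop the loop with an error instead of falling through.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the libmarpa step types; real code would use
 * Marpa_Step_Type and the MARPA_STEP_* constants from marpa.h. */
enum { STEP_RULE, STEP_TOKEN, STEP_NULLING_SYMBOL, STEP_INACTIVE };

/* Hypothetical stub standing in for the valuator: replays a scripted
 * sequence of return values, then reports STEP_INACTIVE. */
typedef struct {
    const int *steps;
    int count;
    int pos;
} Stepper;

/* Hypothetical replacement for marpa_v_step(). */
static int next_step(Stepper *s) {
    return (s->pos < s->count) ? s->steps[s->pos++] : STEP_INACTIVE;
}

/* Returns the number of steps handled, or -1 on a fatal error. */
static int run_valuator(Stepper *s) {
    int handled = 0;
    for (;;) {
        int type = next_step(s);
        if (type < 0) {
            /* Failure: report and stop, do not fall into the switch. */
            fprintf(stderr, "step failed (%d)\n", type);
            return -1;
        }
        switch (type) {
        case STEP_RULE:
        case STEP_TOKEN:
        case STEP_NULLING_SYMBOL:
            handled++;          /* real code evaluates the step here */
            break;
        case STEP_INACTIVE:
            return handled;     /* normal end of the valuation */
        default:
            /* Anything else is unexpected: treat it as fatal. */
            fprintf(stderr, "unexpected step type %d\n", type);
            return -1;
        }
    }
}
```

With a structure like this, a failed step shows up immediately as an error rather than as a trace that looks like it only contains token and inactive steps.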

On Thu, Aug 17, 2017 at 6:55 PM, Andreas Kupries <[email protected]> wrote:

>
> > Looked some more at `rule` and `rslot` and I'm not 100% sure it's wrong,
> > but I can't convince myself it's right.
>
> I actually cannot test that this part is working and correct, because
> according to my trace the `case MARPA_MARPA_STEP_RULE` is never
> entered at all, because `marpa_v_step` is somehow not providing this
> step type, nor any of the other types outside of token and inactive.
>
> Example traces attached, for lexer and bocage. The `rule 65535` in the
> lexer is because the code never goes through the rule case, leaving
> the variable uninitialized as ((unsigned short) -1).
>
> Both traces span bytes (ascii chars) 0-63 of the meta grammar, the
> initial hash comment. The two underlying grammars are slightly
> different because the Tcl engine operates on (unicode) characters and
> handles charclasses through the Tcl regex (as much as possible), and
> the C-level engine operates on bytes with the UTF-8 strings and
> charclasses of the original grammar fully deconstructed into sequences
> of bytes and byte alternatives (See [2]).
>
> Regardless, what I am trying to do with `rslot` is to get the
> next-to-last rule application, which due to how the lexer grammar is
> set up (see below) should be the rule for L [1] (while the last rule
> in the steps is for @L0START [0] itself).
>
> I.e. for a lexeme L, and its ACS (*) A(L)
>
>     [0] @L0START := A(L) L
>     [1] L        := ...
>
> (*) Acceptability Control Symbol (for LATM). LTM symbols are in the
>     'always' section whose ACS is always entered as initial
>     alternative after lexer start_input. The LATM symbols have their
>     ACS entered only when the parser signals them as acceptable.
>
> The 'rslot' array is a history ring buffer of two elements. At the
> end of the loop the last rule entered is [0] and the ring-index points
> to the next element to fill, currently containing the next-to-last
> rule id.
>
> [2] https://core.tcl.tk/akupries/marpa/artifact/7597d1df8f52e366?ln=515-523
>     https://core.tcl.tk/akupries/marpa/artifact/7597d1df8f52e366?ln=546-551
>
> The two below compile a charclass from a set of char-ranges into an
> alternation of byte sequences, in C and Tcl
>
> https://core.tcl.tk/akupries/marpa/artifact/51c800180911b96c
> https://core.tcl.tk/akupries/marpa/artifact/da0c004acee1e326?ln=145 ff
>
> > On Thu, Aug 17, 2017 at 3:23 PM, Jeffrey Kegler <[email protected]> wrote:
> >
> > > Sorry to take so long.
> > >
> > > Ambitious stuff!
>
> Also at the beginning. No handling of the low-level marpa events, not
> to speak of handling SLIF events, ruby slippers and the like.
> Just the core lexer/parser dual engine for now.
>
> > > Nothing obvious. If it were my code, I'd first double-check the `rule`
> > > and `rslot` logic, though nothing about it is obviously wrong.
>
> --
> See you,
>         Andreas Kupries <[email protected]>
>                         <http://core.tcl.tk/akupries/>
>         Developer @ SUSE (MicroFocus Canada LLC)
>                         <[email protected]>
>
> Tcl'2017, Oct 16-20, Houston, TX, USA. http://www.tcl.tk/community/tcl2017/
> -------------------------------------------------------------------------------
>
> --
> You received this message because you are subscribed to the Google Groups
> "marpa parser" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
