RFC: documentation for typed mid-rule actions

Akim Demaille Sun, 12 Aug 2018 02:34:11 -0700

Hi all,

I believe that typed mid-rule actions are ready, except for
the lack of documentation.  Please find below a proposal for
the documentation, currently in the ad/typed-midrule branch;
a PDF is currently here:


https://www.lrde.epita.fr/~akim/private/bison/bison.pdf

I would also like to enforce consistency in Bison, which uses
both midrule and mid-rule.  I am in favor of ‘midrule’
(shorter, consistent with our move to lookahead instead of
look-ahead, more consistent between code and doc as there is
no conversion from dash to underscore, etc.), but I am not
a native speaker!  To which one should we stick? (FWIW, neither
appears in the manual of YACC, only ‘An action appearing in the
middle of a rule […]’ does).

I must have gotten rusty in Texinfo, as both my @ref to
{Typed Mid-Rule Actions} appear as
‘See ⟨undefined⟩ [Typed Mid-Rule Actions], page ⟨undefined⟩’
in PDF, and I have no idea why.  It doesn’t appear in the
table of contents either.  Info and HTML are fine, latest
gnulib.


Thanks in advance!


commit 962317f490ee018fd6f6177495c356e9effc92a3
Author: Akim Demaille <[email protected]>
Date:   Sun Aug 12 10:49:29 2018 +0200

    doc: typed mid-rule actions
    
    * doc/bison.texi (Mid-Rule Actions): Restructure to insert...
    (Typed Mid-Rule Actions): this new section.
    Move the manual translation of mid-rule actions into regular actions
    to...
    (Mid-Rule Action Translation): here.

diff --git a/doc/bison.texi b/doc/bison.texi
index 74ad9e5d..bad88cdb 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -225,6 +225,7 @@ Defining Language Semantics
 Actions in Mid-Rule
 
 * Using Mid-Rule Actions::       Putting an action in the middle of a rule.
+* Typed Mid-Rule Actions::       Specifying the semantic type of their values.
 * Mid-Rule Action Translation::  How mid-rule actions are actually processed.
 * Mid-Rule Conflicts::           Mid-rule actions can cause conflicts.
 
@@ -4071,6 +4072,7 @@ are executed before the parser even recognizes the following components.
 
 @menu
 * Using Mid-Rule Actions::       Putting an action in the middle of a rule.
+* Typed Mid-Rule Actions::       Specifying the semantic type of their values.
 * Mid-Rule Action Translation::  How mid-rule actions are actually processed.
 * Mid-Rule Conflicts::           Mid-rule actions can cause conflicts.
 @end menu
@@ -4158,64 +4160,86 @@ earlier action is used to restore the prior list of variables.  This
 removes the temporary @code{let}-variable from the list so that it won't
 appear to exist while the rest of the program is parsed.
 
+Because the types of the semantic values of mid-rule actions are unknown to
+Bison, type-based features (e.g., @samp{%printer}, @samp{%destructor}) do
+not work, which could result in memory leaks.  This also forbids the use of
+the @code{variant} implementation of the @code{api.value.type} in C++
+(@pxref{C++ Variants}).
+
+@xref{Typed Mid-Rule Actions}, for one way to address this issue, and
+@ref{Mid-Rule Action Translation}, for another: turning mid-rule actions
+into regular actions.
+
+
+@node Typed Mid-Rule Actions
+@subsubsection Typed Mid-Rule Actions
+
 @findex %destructor
 @cindex discarded symbols, mid-rule actions
 @cindex error recovery, mid-rule actions
 In the above example, if the parser initiates error recovery (@pxref{Error
 Recovery}) while parsing the tokens in the embedded statement @code{stmt},
 it might discard the previous semantic context @code{$<context>5} without
-restoring it.
-Thus, @code{$<context>5} needs a destructor (@pxref{Destructor Decl, , Freeing
-Discarded Symbols}).
-However, Bison currently provides no means to declare a destructor specific to
-a particular mid-rule action's semantic value.
-
-One solution is to bury the mid-rule action inside a nonterminal symbol and to
-declare a destructor for that symbol:
+restoring it.  Thus, @code{$<context>5} needs a destructor
+(@pxref{Destructor Decl, , Freeing Discarded Symbols}), and Bison needs the
+type of the semantic value (@code{context}) to select the right destructor.
 
-@example
-@group
-%type <context> let
-%destructor @{ pop_context ($$); @} let
-@end group
+As an extension to Yacc's mid-rule actions, Bison offers a means to type
+their semantic value: specify its type tag (@samp{<...>}) before the mid-rule
+action.
 
-%%
+Consider the previous example, with an untyped mid-rule action:
 
+@example
 @group
 stmt:
-  let stmt
+  "let" '(' var ')'
     @{
-      $$ = $2;
-      pop_context ($let);
-    @};
+      $<context>$ = push_context (); // ***
+      declare_variable ($3);
+    @}
+  stmt
+    @{
+      $$ = $6;
+      pop_context ($<context>5);     // ***
+    @}
 @end group
+@end example
+
+@noindent
+If instead you write:
 
+@example
 @group
-let:
+stmt:
   "let" '(' var ')'
-    @{
-      $let = push_context ();
+    <context>@{                      // ***
+      $$ = push_context ();          // ***
       declare_variable ($3);
-    @};
-
+    @}
+  stmt
+    @{
+      $$ = $6;
+      pop_context ($5);              // ***
+    @}
 @end group
 @end example
 
 @noindent
-Note that the action is now at the end of its rule.
-Any mid-rule action can be converted to an end-of-rule action in this way, and
-this is what Bison actually does to implement mid-rule actions.
+then @code{%printer} and @code{%destructor} work properly (no more leaks!),
+C++ @code{variant}s can be used, and redundancy is reduced (@code{<context>}
+is specified once).
+
 
 @node Mid-Rule Action Translation
 @subsubsection Mid-Rule Action Translation
 @vindex $@@@var{n}
 @vindex @@@var{n}
 
-As hinted earlier, mid-rule actions are actually transformed into regular
-rules and actions.  The various reports generated by Bison (textual,
-graphical, etc., see @ref{Understanding, , Understanding Your Parser})
-reveal this translation, best explained by means of an example.  The
-following rule:
+Mid-rule actions are actually transformed into regular rules and actions.
+The various reports generated by Bison (textual, graphical, etc., see
+@ref{Understanding, , Understanding Your Parser}) reveal this translation,
+best explained by means of an example.  The following rule:
 
 @example
 exp: @{ a(); @} "b" @{ c(); @} @{ d(); @} "e" @{ f(); @};
@@ -4273,6 +4297,45 @@ mid.y:2.19-31: warning: unused value: $3
 @end group
 @end example
 
+@sp 1
+
+It is sometimes useful to turn mid-rule actions into regular actions, e.g.,
+to factor them, or to escape from their limitations.  For instance, as an
+alternative to @emph{typed} mid-rule actions, you may bury the mid-rule
+action inside a nonterminal symbol and declare a printer and a destructor
+for that symbol:
+
+@example
+@group
+%type <context> let
+%destructor @{ pop_context ($$); @} let
+%printer @{ print_context (yyo, $$); @} let
+@end group
+
+%%
+
+@group
+stmt:
+  let stmt
+    @{
+      $$ = $2;
+      pop_context ($let);
+    @};
+@end group
+
+@group
+let:
+  "let" '(' var ')'
+    @{
+      $let = push_context ();
+      declare_variable ($var);
+    @};
+
+@end group
+@end example
+
+
+
 
 @node Mid-Rule Conflicts
 @subsubsection Conflicts due to Mid-Rule Actions
@@ -10523,7 +10586,7 @@ To enable variant-based semantic values, set @code{%define} variable
 @code{%union} is ignored, and instead of using the name of the fields of the
 @code{%union} to ``type'' the symbols, use genuine types.
 
-For instance, instead of
+For instance, instead of:
 
 @example
 %union
@@ -10536,7 +10599,7 @@ For instance, instead of
 @end example
 
 @noindent
-write
+write:
 
 @example
 %token <int> NUMBER;
@@ -10555,7 +10618,10 @@ Variants are stricter than unions.  When based on unions, you may play any
 dirty game with @code{yylval}, say storing an @code{int}, reading a
 @code{char*}, and then storing a @code{double} in it.  This is no longer
 possible with variants: they must be initialized, then assigned to, and
-eventually, destroyed.
+eventually, destroyed.  As a matter of fact, Bison variants forbid the use
+of alternative types such as @samp{$<int>2} or @samp{$<std::string>$}, even
+in mid-rule actions.  It is mandatory to use typed mid-rule actions
+(@pxref{Typed Mid-Rule Actions}).
 
 @deftypemethod {semantic_type} {T&} build<T> ()
 Initialize, but leave empty.  Returns the address where the actual value may
@@ -10575,10 +10641,13 @@ Boost.Variant not only stores the value, but also a tag specifying its
 type.  But the parser already ``knows'' the type of the semantic value, so
 that would be duplicating the information.
 
+We do not use C++17's @code{std::variant} either: we want to support all the
+C++ standards, and of course @code{std::variant} also stores a tag to record
+the current type.
+
 Therefore we developed light-weight variants whose type tag is external (so
-they are really like @code{unions} for C++ actually).  But our code is much
-less mature that Boost.Variant.  So there is a number of limitations in
-(the current implementation of) variants:
+they are really like @code{unions} for C++ actually).  There are a number of
+limitations in (the current implementation of) variants:
 @itemize
 @item
 Alignment must be enforced: values should be aligned in memory according to
@@ -10588,6 +10657,9 @@ therefore, since, as far as we know, @code{double} is the most demanding
 type on all platforms, alignments are enforced for @code{double} whatever
 types are actually used.  This may waste space in some cases.
 
+@item
+Move semantics is not yet supported, but will soon be added.
+
 @item
 There might be portability issues we are not aware of.
 @end itemize
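
To picture what "variants whose type tag is external" means in the last
hunks, here is a minimal C++ sketch.  It is an illustration only, not part
of the patch and not Bison's actual implementation: the class name
untagged_value, its 64-byte capacity, the destroy<T>() member and the
round_trip() helper are all assumptions made up for this example, and
build<T>() here takes an initial value, unlike the build<T>() documented in
the manual.  The buffer is aligned for double, as the text describes, and
the caller (the parser, in Bison's case) is the one who remembers which
type is currently stored.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical sketch: raw storage for one semantic value.  No type tag is
// kept inside; the caller must name the right type at every access.
class untagged_value
{
public:
  untagged_value () = default;
  untagged_value (const untagged_value&) = delete;
  untagged_value& operator= (const untagged_value&) = delete;

  // Construct a T in the buffer.  The buffer must currently be empty.
  template <typename T>
  T& build (const T& v)
  {
    check<T> ();
    return *new (buffer_) T (v);
  }

  // Access the stored value.  The caller guarantees a T was built.
  template <typename T>
  T& as ()
  {
    check<T> ();
    return *reinterpret_cast<T*> (buffer_);
  }

  // Run T's destructor, leaving the buffer empty again.
  template <typename T>
  void destroy ()
  {
    as<T> ().~T ();
  }

private:
  enum { capacity = 64 }; // arbitrary size chosen for this sketch

  template <typename T>
  static void check ()
  {
    static_assert (sizeof (T) <= capacity, "type too large for the buffer");
    static_assert (alignof (T) <= alignof (double),
                   "type needs stricter alignment than double");
  }

  // Aligned as for double, whatever types are actually stored.
  alignas (double) unsigned char buffer_[capacity];
};

// Store an int, then a std::string, in the same storage.  Every access
// repeats the type, because the storage itself does not record it.
std::string round_trip ()
{
  untagged_value v;
  v.build<int> (42);
  int n = v.as<int> ();
  v.destroy<int> ();

  v.build<std::string> ("let");
  std::string s = v.as<std::string> () + std::to_string (n);
  v.destroy<std::string> ();
  return s;
}
```

Every access in the sketch has to spell out the type, and a forgotten
destroy<std::string>() would leak.  This is why the parser needs to know
the type of each semantic value, including those of mid-rule actions.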
