Hi Akim, On Wed, 21 Nov 2018 17:57:10 +0100, Akim Demaille wrote: > Please, stick to our style for the commit messages. Look at a few examples, > and imitate.
I will try to do my very best next time. :) > I installed the first as follows. > > commit 67d77493fd16de3472c6f09b59f18946b83dfbe0 > Author: Jannick <[email protected]> > Date: Tue Nov 20 22:00:43 2018 +0100 > > doc: calc++: ignore \r in the scaner (minor: a typo.) > * doc/bison.texi (Calc++ Scanner): Ignore \r. > > diff --git a/doc/bison.texi b/doc/bison.texi index 01d7c8dc..371cec84 100644 > --- a/doc/bison.texi > +++ b/doc/bison.texi > @@ -11897,7 +11897,7 @@ Abbreviations allow for more readable rules. > @example > id [a-zA-Z][a-zA-Z_0-9]* > int [0-9]+ > -blank [ \t] > +blank [ \t\r] > @end example > > @noindent This code snip appears not only in the sample code files, but in bison's documentation, too. So, for didactic reasons, I would be hesitating to put '\r' into a regexp named 'blank', albeit ultimately no difference to the scanner, of course. The most natural way I wanted to handle this without changing to much would have been to define a regexp dealing with any kind of line break, however, that collides with the line counting [\n]+ loc.lines (yyleng); This was the reason I put '\r' into a separate line to simply ignore them. Thinking about that a bit more and generalizing this for probably almost all operating systems out there, I believe /(\r\n|\n\r|\n|\r)/ would be a good pattern for a line break to start off with and the line counter must be adjusted to. Suggestion (using flex'es greediness): ((\r\n)+|(\n\r)+) loc.lines (yyleng/2); (\n+|\r+) loc.lines (yyleng); This would work even for mixed EOL files (with both '\n' and '\r\n') on my Windows machine, where I have never seen a file with both '\n' and '\r' as single EOL character. Not tested. > I think the third one is wrong. yylex_destroy is for reentrant scanners, it’s > releasing what yylex_init acquired. 'yylex_destroy' exists for both, reentrant and non-reentrant scanners, and calling it after having used the scanner is required to avoid memory leakage in both cases. See below. > We don’t use a reentrant scanner, and we don’t call yylex_init, so I don’t > think we should call yylex_destroy.= This is a probably not so well documented issue for __non-reentrant__ flex scanners: flex allocates memory for the stack object 'yy_buffer_stack' every buffer struct is pushed onto. For reentrant scanners the stack is a member of the yyscanner object, for non-reentrant ones it is a global variable. Regardless of the scanner type 'yyensure_buffer_stack' (re)allocates memory for the stack object to make sure that it exists and the stack has sufficient size. This happens, e.g., when 'yylex' is called the first time. If, for a non-reentrant scanner, we do not call 'yylex_destroy' at the end of the program, there is memory leakage. Same applies to reentrant scanners which is more known. So, still my suggestion for the non-reentrant scanner calc++ to install a patch making sure that 'yylex_destroy' is called at the end. BTW: I believe that the same issue applies to bison's three non-reentrant scanners. If memory leakage is important to bsion, then probably a good idea to add, say at the end of function 'reader()', something like gram_lex_destroy(); code_lex_destroy(); and at the end of 'scan_skel()' skel_lex_destroy(); Thanks, J.
