Re: Please, support easy AST generation

2018-12-20 Thread Yijun . Yu
Maybe my comment is rather late. Forgive me if it is unrelated to the current 
discussions.

I have been doing AST generation from Bison for some time, since the YAXX
subproject. There the initial aim was to output the parse trees as XML, and
some people liked that. But occasionally I received questions on how to do
this for ASTs. At that time I was not a Bison maintainer, so the approach I
took was to do this as a change to the “yacc.c” template and keep everything
else the same.

The actions decide whether a parse tree node (or element, as I call it in the
XML counterpart) is worth keeping or not. If it is not, I simply do not
generate the tags, similar to what you propose for conditional printing.

The nice thing about this approach is that it is lightweight in terms of
maintenance effort, if all you want is to print the AST. What was needed for
the YAXX project was to create an XSLT stylesheet that reads in a “dictionary”
of action names, then uses the condition to filter out those tags, in less
than 20 lines of code.
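
To illustrate the filtering idea, here is a minimal sketch in C++ rather than
XSLT, so it is not the YAXX stylesheet itself. The `Node` type, the `keep`
set (standing in for the “dictionary”) and the function names are all
hypothetical, and text is written without XML escaping to keep it short.

```cpp
#include <cassert>
#include <ostream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical parse tree node; YAXX's real data structures may differ.
struct Node {
    std::string name;            // rule or token name, used as the XML tag
    std::string text;            // token text for leaves, empty otherwise
    std::vector<Node> children;  // sub-nodes in source order
};

// Write XML for `n`, emitting tags (and text) only for names in `keep`.
// Children of a dropped node are still visited, so they end up inside
// the nearest kept ancestor.
void emit(const Node& n, const std::set<std::string>& keep, std::ostream& out) {
    assert(!n.name.empty());  // every node needs a name to serve as a tag
    const bool kept = keep.count(n.name) != 0;
    if (kept) out << '<' << n.name << '>' << n.text;
    for (const Node& child : n.children) emit(child, keep, out);
    if (kept) out << "</" << n.name << '>';
}

// Convenience wrapper returning the filtered XML as a string.
std::string to_xml(const Node& root, const std::set<std::string>& keep) {
    std::ostringstream out;
    emit(root, keep, out);
    return out.str();
}
```

For a tree `expr(term(NUM "1"), PLUS "+", term(NUM "2"))` and a keep set of
`expr` and `NUM`, this yields `<expr><NUM>1</NUM><NUM>2</NUM></expr>`: the
`term` and `PLUS` tags disappear while the kept leaves move up.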

But maybe supporting it natively in the Bison-generated template is better.
Before deciding on this effort, do you want to take a look at the yaxx branch
below as a starting point?

http://git.savannah.gnu.org/cgit/bison.git/log/?h=yaxx

Cheers — Yijun

> On 9 Dec 2018, at 06:47, Frank Heckenbach wrote:
> 
> Askar Safin wrote:
> 
>> Hi. Often the only thing I want to do in Bison is just generate
>> AST and nothing else. Unfortunately, in this case code becomes
>> very repetitive. For example, this is Bison input I used to
>> generate AST for small JavaScript subset:
>> https://zerobin.net/?b9af68c9aa7a31a9#JB4wZseYTq9aKfZOwwZrDqBNCqlEZRj/+DM9bKdgtKU=
>> . Please, support some feature, which eliminates the need to write
>> such repetitive code. Such code can be easily generated from
>> grammar alone
> 
> Indeed, I had discussed something similar with Akim Demaille
> privately some time ago; he also mentioned it on this list a few
> weeks ago (I guess I'm one of the "some people" he refers to, but
> unfortunately I was too busy then to get involved):
> 
> : Having a means to ask Bison to generate actions given some form of
> : a generic pattern is a different topic.  It makes a lot of sense,
> : and some people have discussed about this on various occasions
> : here.  That would help to bind to some abstract factory for ASTs
> : for instance, without forcing a specific AST format.
> 
> So let me restate and expand upon my suggestion here, which is,
> of course, just a rough sketch. This refers mainly to the C++ output
> as it makes heavy use of templates and overloading. Maybe something
> similar can be done in Java etc., but I'll leave that to the
> respective experts. In C, I think this will only work in a less
> general way which requires more effort by the end-user.
> 
> Actually, this would not only mostly automatically generate actions
> to generate ASTs and the like, but also some normal actions in my
> current grammars that don't primarily build ASTs, but contain
> actions that build "mini sub-ASTs", i.e. structures that are later
> acted upon in other actions, e.g. parameter lists while parsing a
> function declaration or call. I guess such actions might be common
> in many grammars.
> 
> AFAIK, Bison's current behaviour applied to any rule can be summed
> up like this (I hope I'm not missing important details):
> 
> 1. If a user-specified action is given, output this action and
>   nothing else; otherwise:
> 
> 2. if $$ has no declared type, do nothing; otherwise:
> 
> 3. if there are no RHS symbols, default-initialize $$ (when Bison
>   knows how to, esp. when using variants); otherwise:
> 
> 4. output the default action "$$ = $1;" (possibly with auto-move),
>   warning if the declared types of $$ and $1 are different
>   (which may still result in a successful compilation if the types
>   are compatible, or a compiler-error down the line if not).
> 
>   As a side note, though I suggest to introduce this default action
>   into C++, I'm not sure if it's actually useful to apply it (in
>   any language) when there's more than one RHS argument (or at
>   least, more than one typed one, more about this below). But if
>   that's required by POSIX, we'd have no choice, at least in C.
> 
> My suggestion is basically to insert another step (optional, of
> course) to auto-generate actions just based on the types involved
> before 3., or even before 2. -- the latter could be useful to
> generate actions that consume parsed data (e.g. to compile a
> function or to store a declaration in an interpreter) rather than
> build structures such as ASTs, e.g.:
> 
> 1.5: if an auto-action for the types of $$ and $n can be generated,
> do so; otherwise: ...
> 
> So how can such an auto-action be specified?
> A rather general way may use a form similar to %printer, e.g.:
> 
>  %auto-action { build_foo ($$, $^); } <foo>;
> 
> Then any rule where $$ is of type foo and which has no explicit
> 
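
The per-rule behaviour listed in the quoted message (steps 1 to 4, plus the
proposed step 1.5) can be read as a small decision procedure. The sketch
below models it in C++; it is not Bison's implementation, and the `Rule`
struct, the `Action` enum and the idea of keying auto-actions on the type of
$$ are assumptions made for illustration. Step 1.5 is placed before step 2
here, which is one of the two positions the message suggests.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of what gets emitted for one grammar rule.
enum class Action { User, Auto, None, DefaultInit, CopyFirst };

// Hypothetical summary of a rule; not a Bison data structure.
struct Rule {
    bool has_user_action;                // an explicit { ... } was written
    std::string lhs_type;                // declared type of $$, empty if none
    std::vector<std::string> rhs_types;  // one entry per RHS symbol
};

// `auto_types` holds the $$ types for which an auto-action was declared
// (assumption: auto-actions are selected by the type of $$ alone).
Action choose_action(const Rule& r, const std::set<std::string>& auto_types) {
    if (r.has_user_action) return Action::User;                 // step 1
    if (auto_types.count(r.lhs_type)) return Action::Auto;      // step 1.5
    if (r.lhs_type.empty()) return Action::None;                // step 2
    if (r.rhs_types.empty()) return Action::DefaultInit;        // step 3
    return Action::CopyFirst;                                   // step 4
}

// Step 4 warns when the declared types of $$ and $1 differ.
bool copy_first_warns(const Rule& r) {
    assert(!r.rhs_types.empty());  // only meaningful when $1 exists
    return r.lhs_type != r.rhs_types.front();
}
```

Step 3 is simplified: the message notes that $$ is only default-initialized
when Bison knows how to, which this model does not capture.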

Re: Enhancement request: enabling Variant in C parsers

2018-10-30 Thread Yijun . Yu
Hi Victor,

I agree with Akim that, for now, we need to keep bison’s current design in
order to support all the use cases it supports.

A while ago I made an attempt to generate an AST representation in XML from
bison parsing.

If what you are looking for is to simplify the way the ASTs are handled, it
might be worth taking a look to see if it fits your purposes and could
separate the concerns.
```
git clone https://git.savannah.gnu.org/git/bison.git
cd bison
git checkout yaxx
```

The branch is slightly outdated; I will test it again if you experience any
problems.

Best regards,
Yijun

On 30 Oct 2018, at 10:10, Victor Khomenko
<victor.khome...@newcastle.ac.uk> wrote:

Hi Akim,

Re flex/bison, ANTLR, and racing cars:

I think bison has a number of cool features, in particular nice error handling,
support for full LR(1), and GLR. They definitely give it an edge over other
parser generators.

Where it fails: Mundane things like clunky interface with a scanner, too many 
includes - so too many build dependencies.

The latter is partially related to the scanner interface, e.g. if the scanner 
were integrated, there would be no need to generate parser.h in many cases, 
i.e. one could manage with a single generated file parser.c[pp].

It would be nice to have some stats about how parser generators (not just 
bison) are used (maybe you have it). My speculation is that it's mostly *not* 
about programming languages. Most of my parsers are for simple expressions 
(every now and then there is some legacy pre-XML format that is mostly regular 
but has fields containing expressions). In such use-cases, the mundane things 
prevail and people will increasingly choose e.g. ANTLR for new projects. Ok, 
maybe they would still choose bison for racing cars (i.e. programming 
languages).

I'm not sure what the future plans for bison are, but I hope it has not quite
reached that stage when one declares that it has done its service to the 
community and it's time to retire and give way to the younger generation... So 
I'd still consider the possibility of integrating a scanner generator into 
bison, maybe a severely cut-down version of flex, without any fancies like 
REJECT, etc. Essentially, it should be possible to build an equivalent of 
calc++ with only calc++.y and generated calc++.cpp, without any other files. 
I'd vote for this as the most desirable feature. I realise it's much work, but 
I believe without this bison will eventually lose to ANTLR.

Cheers,
Victor.
