Hi Nikhil,

Thanks for the answer. If I understand your points correctly, your main points 
are:

* Nobody would want to rewrite the parser
* Autolev language is simple enough, so AST is not worth the additional 
complexity / code
* Converting from CST to AST is not trivial in this case

I thought about these questions a lot and my view is the following:

Regarding the first point, people definitely might want to try some other 
parsers, especially since Autolev is such a simple language. I know for example 
that Aaron was not 100% convinced that ANTLR is the best way. One disadvantage 
of ANTLR is that it produces very large parser files. For example even for your 
simple grammar, the parser has almost 3000 lines 
(https://github.com/NikhilPappu/autolev-parser/blob/f6d110ff5a8cef3aea4f6357f343491572350393/AutolevParser.py)
 and such big files are not super easy to deal with, for example they grow the 
sympy git repository a lot, so we might not want to have those checked in. But 
then people require Java and ANTLR to generate those. And if it is possible to 
write a simpler parser, for example with pyparsing or something similar, then 
perhaps in the future we might want to go this route. Note that I personally 
like and recommend ANTLR, as it gets the job done pretty easily and cleanly, I 
really like that it keeps the grammar g4 files separate from the code.

Regarding the second point, If you look at your parser, you have to handle 
variables and some other stuff, and it is all tied up to how exactly ANTLR 
represents the nodes, as defined in your g4 file. So if I want to polish / 
change the g4 file, I have to understand the rest of your compiler. That's 
actually not so easy. As a particular example, this line 
https://github.com/NikhilPappu/autolev-parser/blob/f6d110ff5a8cef3aea4f6357f343491572350393/myListener.py#L205
 depends on how exactly you write the rule in the g4 file, so that the 2nd 
Child is what you expect it to be. If you rework the g4 file, this will break. 
And since this is used all over the compiler, it's not easy for me to quickly 
figure out what needs to be changed and checked. If it used AST, then I just 
have to make sure my changes to the g4 file generate exactly the same AST, and 
don't have to worry about the rest. In practice, I just have to go to the 
visitor methods that deal with the old g4 rules, and modify those for the new 
g4 rules, to generate the same AST. So the AST approach is actually simpler, 
from this perspective.

Regarding the third point, I don't quite understand what you meant, as the 
point of AST is that you lose syntax information that you don't need. If there 
is some information that you need, it needs to be included in AST.

For now, I suggest you keep doing what you are doing, but I will try to send a 
PR with the initial stub for an AST, for this problem, so that you get an idea 
how it all works.

No matter however, whether AST is used or not, we will need thorough tests for 
the grammar, essentially every rule should be tested by tests. So that if we 
switch to AST later, we can be sure that we didn't break anything.

Ondrej

On Fri, May 18, 2018, at 10:39 AM, Nikhil Pappu wrote:
> 
> 
> On Friday, May 18, 2018 at 4:08:29 PM UTC+5:30, Ondřej Čertík wrote:
> >
> >
> >
> > On Fri, May 18, 2018, at 1:32 AM, Nikhil Pappu wrote: 
> > > Ondřej, Jason, 
> > > 
> > > I have written my first blog post <https://wp.me/p3Sr5x-2> discussing 
> > the 
> > > project details and status. 
> > > Can you please go over it and provide feedback? 
> >
> > I think overall it looks very good. It looks like you got a good handle of 
> > ANTLR4. 
> >
> > I noticed you are using a listener: 
> >
>  
> I prefer using listeners over visitors. I feel it helps me to focus on the 
> logic than on the tree traversal. 
> 
> >
> > One thing to consider is whether to directly use the grammar from ANTLR, 
> > which is called a concrete syntax tree (CST), and construct a sympy 
> > expression of whatever you want out of it directly. That is what you do now 
> > I think. 
> >
>  
> Yes, I am using the CST approach.
> 
> 
> > The alternative is to first construct an abstract syntax tree (AST) from 
> > the CST. That way the AST will be the input to the rest of your "compiler", 
> > and it won't matter that we currently use ANTLR to construct it.
> 
>  
> I only have a basic idea of ASTs. I have never implemented them so I do not 
> have any experience using them.
> The ANTLR book showcased only the CST approach so I am much more 
> comfortable with that.
> I think ANTLR generates something in between a CST and an AST but it does 
> have a lot of clutter compared to an AST. 
> I was just looking at some examples of using ASTs with ANTLR in 
> stackoverflow but most of the examples were quite trivial.
> The conversion from CST to AST for the Autolev grammar wouldn't be so 
> trivial if we want to obtain a transformation with minimal loss.
> Also, I would need to change the parser code quite significantly to deal 
> with the new tree. 



> 
> 
> Later somebody can decide to write a hand parser, or some other tool. As 
> > long as it produces the AST, that's all that matters. 
> 
> 
> I'm not sure why someone would want to do that if we already have a working 
> parser.
> One might rather look towards changing the actual parser code itself in my 
> opinion, whether to fix any issues or add new functionality.
>  
> 
> > As it is currently, the CST depends on the exact rules as you write them 
> > in the ANTLR grammar, and everytime you change the rules, you have to 
> > rework your compiler. While if you use AST, then you only have to modify 
> > the code that converts the (modified) CST to AST, but that's it. The rest 
> > of the compiler uses AST, and so it doesn't need to change. 
> 
> 
> That is an advantage but I don't think we would have to change the grammar 
> all that much. I have tested the grammar on a lot of inputs and it seems to 
> parse them just fine.
>  
> 
> > Python, for example, uses this approach. It parses Python code into an 
> > AST, and so they are free to change the parser any way they like ---- our 
> > SymPy code that uses the Python AST doesn't need to change. 
> 
>  
> I think that the AST approach would be a much better one in the case of 
> more complex languages like Python or C.
> I think this is not as important in this case as Autolev is a much simpler 
> language. 
> At its heart it mainly only consists or declarations, assignments and 
> commands which makes direct parsing using a CST convenient enough.
> 
> >
> >
> > > Also, when will you be available to discuss things? 
> >
> > I am available over email only the next week. 
> >
> > Ondrej 
> >
> > > 
> > > -- 
> > > You received this message because you are subscribed to the Google 
> > > Groups "sympy" group. 
> > > To unsubscribe from this group and stop receiving emails from it, send 
> > > an email to sympy+un...@googlegroups.com <javascript:>. 
> > > To post to this group, send email to sy...@googlegroups.com 
> > <javascript:>. 
> > > Visit this group at https://groups.google.com/group/sympy. 
> > > To view this discussion on the web visit 
> > > 
> > https://groups.google.com/d/msgid/sympy/bc81cd14-ad9b-4503-a639-aa6f673f4c51%40googlegroups.com.
> >  
> >
> > > For more options, visit https://groups.google.com/d/optout. 
> >
> 
> -- 
> You received this message because you are subscribed to the Google 
> Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send 
> an email to sympy+unsubscr...@googlegroups.com.
> To post to this group, send email to sympy@googlegroups.com.
> Visit this group at https://groups.google.com/group/sympy.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sympy/d207a582-c67e-4219-8a6f-1d5c6a640ebe%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sympy+unsubscr...@googlegroups.com.
To post to this group, send email to sympy@googlegroups.com.
Visit this group at https://groups.google.com/group/sympy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/1526929667.1412532.1379717184.329462E1%40webmail.messagingengine.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to