Hi Siddhesh,

Thanks for these insights! :-)

Some more to add to the list:

- The module libs.antlr.cnd is the donated version of cnd.antlr in NetBeans 8.2, which was a fork of antlr2. antlr2 is a problem from a licensing point of view, as Terence Parr (antlr's main author) states at [1], and we should be using antlr 3.4 and onwards. The fork contained some interesting hacks, gone with the wind now.

- cnd.apt has a _lexer_ for C/C++/Fortran, and a _grammar_ for the C preprocessor. This makes it possible for CND to understand directives such as #define, #ifdef and #ifndef. These are the *.g files at [2], written in antlr2. I'm currently trying to migrate them to antlr3. Looks promising.

- cnd.apt compiles these grammar files and then generates an APTTokenTypes.txt file [3] and its Java counterpart.

- cnd.modelimpl uses this file (at compile time) as the _lexer_ for the CXXParser.g grammar [4].

- clang does indeed seem the way to go, but may need some research. It seems clang has a stable C API that we could use, called "libclang". Some people say libclang falls short on the AST side, and have created some Apache-licensed libraries for enhancing the AST. These are now being used in the Embarcadero IDE for C/C++ [5].
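
To illustrate the role of the token types file mentioned above, here is a minimal Java sketch. It is not NetBeans or ANTLR code: the class name TokenVocabSketch is hypothetical, the sample token names are made up, and it assumes the file contains plain lines of the form NAME=number (anything else, including string-literal entries, is skipped). The point is only that the file is a shared name-to-number table that keeps the lexer in cnd.apt and the parser in cnd.modelimpl in agreement.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of NetBeans or ANTLR. */
final class TokenVocabSketch {

    /** Reads lines of the assumed form NAME=123; every other line is skipped. */
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> types = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            int eq = line.lastIndexOf('=');
            if (line.isEmpty() || line.startsWith("//") || eq <= 0) {
                continue; // comment, vocabulary name, or blank line
            }
            String name = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            if (!name.matches("[A-Za-z_][A-Za-z0-9_]*") || !value.matches("\\d+")) {
                continue; // entries carrying a literal text are ignored in this sketch
            }
            types.put(name, Integer.valueOf(value));
        }
        return types;
    }

    public static void main(String[] args) {
        // Sample content is invented for the demo, not copied from APTTokenTypes.txt.
        List<String> sample = List.of(
            "// header comment",
            "APT    // output token vocab name",
            "IDENT=4",
            "INCLUDE=5");
        System.out.println(parse(sample));
    }
}
```

Any consumer that shares this table can interpret the numeric token types produced by the lexer, which is why cnd.modelimpl needs the file at build time.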


For the time being I think I'll continue porting antlr2 to antlr3 grammars for cnd.apt (possibly recovering cnd.antlr with antlr3), and then try to use them elsewhere. Will report in a few days.

All research about clang alternatives is indeed welcome.

Thanks Siddhesh!

Kind regards,
Antonio



[1]
"Because the v2 license was unclean, projects such eclipse could not include ANTLR v3. This version, 3.4, is completely BSD clean and all additions were subject to the click wrap license or the ANTLR contributor's certificate of origin."

https://theantlrguy.atlassian.net/wiki/spaces/ANTLR3/pages/2687376/ANTLR+3.4+Release+Notes

[2]
https://github.com/apache/netbeans/tree/cnd/cnd/cnd.apt/src/org/netbeans/modules/cnd/apt/impl/support

[3]
cnd.apt generating APTTokenTypes.txt at build time.

https://github.com/apache/netbeans/blob/f9b7c264647c39127e82b756dcdbb9752323d2ad/cnd/cnd.apt/build.xml#L71


[4]
cnd.modelimpl's CXXParser.g using cnd.apt's APTTokenTypes at build time.

https://github.com/apache/netbeans/blob/f9b7c264647c39127e82b756dcdbb9752323d2ad/cnd/cnd.modelimpl/src/org/netbeans/modules/cnd/modelimpl/parser/CXXParser.g#L28

[5]
https://foonathan.net/2017/04/cppast/
https://github.com/foonathan/cppast
https://www.embarcadero.com/de/products/cbuilder/starter

On 18/04/2021 2:03, Siddhesh Rane wrote:
I did some dependency analysis on the cnd modules and here is what I found.

* The current Antlr based parsers and their generated ASTs are not directly 
used by any module in cnd (such as Code Completion, Navigator etc). Instead all 
modules depend on an AST-like code model specified in C/C++ Code Model API 
(cnd.api.model)

* The Code Model API has an implementation in cnd.modelimpl which contains the 
parsers and antlr grammar files. Internally, this module converts the parser's 
generated AST into the Code Model API.

* There is a clean separation between the code model impl and api; only the API 
module is used in the rest of the cnd modules, without knowledge of the 
implementation details (cnd.modelimpl is only used for test cases; cnd.modelui 
has a dependency, but it appears in a package related to tracing, so I assume 
it's for debugging).

* The other module containing an Antlr file, Abstract Preprocessor Tree 
(cnd.apt) is only a dependency for cnd.modelimpl.
cnd.completion has it listed as a dependency but no java file imports anything 
from that module. cnd.modelui again uses it only in a trace package. So cnd.apt 
is related only to model implementation.

* There are some modules with name "clank" in them. These can be completely 
removed because by default they are disabled using system flags. It seems to be an 
experimental clang integration that was never used.

Based on these observations, I think we can safely get rid of all Antlr files 
by targeting the Code Model Implementation module.
My suggested plan of action would be:

1. Introduce a clang based parser alongside the existing parsers. Look at class 
org.netbeans.modules.cnd.modelimpl.parser.ParserProviderImpl. This currently 
contains two C++ parsers: Antlr2CppParser and Antlr3CXXParser, with the antlr 2 
being used by default. The parsers have a parse method which does parsing into 
custom AST and a render method where custom AST is used to create Code Model 
API objects. The render method is where we will need a mapping from the clang AST 
to the code model api. (I am trying to find good tutorials about the clang AST.)

2. Once we test the clang parser, we can focus on cnd.apt. This package does 
have some clang translation support with stuff such as compilation db, so 
eventually it can be completely removed if successfully transitioned.

3. Finally a lot of code can be simplified. Particularly, the indexing is 
happening at the same time as parsing. This needs to be moved over to the 
Indexing API, as is done for Java.
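
As a rough illustration of step 1, the parse/render split could be sketched in Java as below. Every name here (ModelDecl, AstNode, TwoPhaseParser, ClangParserSketch, the kind labels) is hypothetical and stands in for the real cnd.api.model types and for whatever a clang bridge would return; the front end is passed in as a function because no actual clang binding is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for a Code Model API object; not a real cnd.api.model type. */
final class ModelDecl {
    final String kind;
    final String name;
    ModelDecl(String kind, String name) { this.kind = kind; this.name = name; }
    @Override public String toString() { return kind + " " + name; }
}

/** Hypothetical AST node as a clang front end might report it: a kind label and a name. */
final class AstNode {
    final String kind;
    final String name;
    final List<AstNode> children;
    AstNode(String kind, String name, AstNode... children) {
        this.kind = kind;
        this.name = name;
        this.children = List.of(children);
    }
}

/** The two-phase contract from step 1: parse to a private AST, then render it. */
interface TwoPhaseParser {
    AstNode parse(String source);
    List<ModelDecl> render(AstNode ast);
}

final class ClangParserSketch implements TwoPhaseParser {
    // Assumed bridge to clang, supplied by the caller; nothing here calls clang itself.
    private final Function<String, AstNode> frontEnd;

    ClangParserSketch(Function<String, AstNode> frontEnd) { this.frontEnd = frontEnd; }

    @Override public AstNode parse(String source) { return frontEnd.apply(source); }

    @Override public List<ModelDecl> render(AstNode ast) {
        List<ModelDecl> out = new ArrayList<>();
        collect(ast, out);
        return out;
    }

    // Depth-first walk mapping the node kinds this sketch knows to model objects.
    private static void collect(AstNode node, List<ModelDecl> out) {
        switch (node.kind) {
            case "FunctionDecl": out.add(new ModelDecl("function", node.name)); break;
            case "ClassDecl": out.add(new ModelDecl("class", node.name)); break;
            default: break; // unmapped kinds are skipped; their children are still visited
        }
        for (AstNode child : node.children) {
            collect(child, out);
        }
    }

    public static void main(String[] args) {
        // Hand-built tree standing in for what a clang bridge would produce.
        AstNode tree = new AstNode("TranslationUnit", "demo.cpp",
            new AstNode("ClassDecl", "Widget"),
            new AstNode("FunctionDecl", "main"));
        TwoPhaseParser parser = new ClangParserSketch(src -> tree);
        System.out.println(parser.render(parser.parse("unused in this demo")));
    }
}
```

Keeping the clang-specific tree behind parse and doing all Code Model construction in render means the rest of cnd would keep depending only on the API module.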

I hope that with this approach we can keep up with latest C++ language releases 
while not having to sacrifice any of the functionality in the cnd module.

Siddhesh
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

For further information about the NetBeans mailing lists, visit:
https://cwiki.apache.org/confluence/display/NETBEANS/Mailing+lists



