Hi Siddhesh,
Thanks for these insights! :-)
Some more to add to the list:
- The module libs.antlr.cnd is the donated version of cnd.antlr in
NetBeans 8.2, which was a fork of antlr2. antlr2 is a problem from a
licensing point of view, as Terence Parr (antlr's main author) states at
[1], so we should be using antlr 3.4 or later. The fork contained
some interesting hacks, gone with the wind now.
- cnd.apt has a _lexer_ for C/C++/Fortran, and a _grammar_ for the C
preprocessor. This makes it possible for CND to understand #define,
#ifdef and #ifndef directives. These are the *.g files at [2], written
in antlr2. I'm currently trying to migrate them to antlr3. Looks promising.
- cnd.apt compiles these grammar files and then generates an
APTTokenTypes.txt file [3] and its Java counterpart.
- cnd.modelimpl uses this file (at compile time) as the _lexer_ token
vocabulary for the CXXParser.g grammar [4].
- clang does indeed seem the way to go, but may need some research. It
seems clang has a stable C API that we could use, called "libclang". Some
people say libclang falls short in the AST it exposes, and have created
some Apache-licensed libraries for enhancing the AST. These are now being
used in the Embarcadero IDE for C/C++ [5].
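To make the APTTokenTypes.txt hand-off between cnd.apt and cnd.modelimpl a
bit more concrete, here is a minimal Java sketch that reads such a token
vocabulary file. It assumes the usual antlr2 layout (a "//" header comment,
a vocabulary-name line, then NAME=number or NAME="literal"=number entries).
The class name TokenVocabSketch and the sample tokens are made up for
illustration; they are not taken from the real file or from cnd.apt.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of cnd.apt or cnd.modelimpl.
class TokenVocabSketch {

    // Maps each token name to its numeric token type, in file order.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> types = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("//")) {
                continue; // blank line or header comment
            }
            int first = line.indexOf('=');
            if (first < 0) {
                continue; // assumed to be the vocabulary-name line
            }
            // The name ends at the first '=', the number starts after the
            // last '=', so NAME="literal"=number entries work as well.
            int last = line.lastIndexOf('=');
            String name = line.substring(0, first).trim();
            int type = Integer.parseInt(line.substring(last + 1).trim());
            types.put(name, type);
        }
        return types;
    }

    public static void main(String[] args) {
        // Made-up sample content in the assumed format.
        List<String> sample = Arrays.asList(
                "// header comment",
                "APT    // vocabulary name",
                "IDENT=4",
                "LITERAL_if=\"if\"=5");
        System.out.println(parse(sample)); // prints {IDENT=4, LITERAL_if=5}
    }
}
```

The point is only that the parser grammar and the lexer must agree on these
numbers, which is why the file is generated once and shared at build time.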
For the time being I think I'll continue porting antlr2 to antlr3
grammars for cnd.apt (possibly recovering cnd.antlr with antlr3), and
then try to use them elsewhere. Will report in a few days.
All research about clang alternatives is indeed welcome.
Thanks Siddhesh!
Kind regards,
Antonio
[1]
"Because the v2 license was unclean, projects such eclipse could not
include ANTLR v3. This version, 3.4, is completely BSD clean and all
additions were subject to the click wrap license or the ANTLR
contributor's certificate of origin."
https://theantlrguy.atlassian.net/wiki/spaces/ANTLR3/pages/2687376/ANTLR+3.4+Release+Notes
[2]
https://github.com/apache/netbeans/tree/cnd/cnd/cnd.apt/src/org/netbeans/modules/cnd/apt/impl/support
[3]
cnd.apt generating APTTokenTypes.txt at build time.
https://github.com/apache/netbeans/blob/f9b7c264647c39127e82b756dcdbb9752323d2ad/cnd/cnd.apt/build.xml#L71
[4]
cnd.modelimpl's CXXParser.g using cnd.apt's APTTokenTypes at build time.
https://github.com/apache/netbeans/blob/f9b7c264647c39127e82b756dcdbb9752323d2ad/cnd/cnd.modelimpl/src/org/netbeans/modules/cnd/modelimpl/parser/CXXParser.g#L28
[5]
https://foonathan.net/2017/04/cppast/
https://github.com/foonathan/cppast
https://www.embarcadero.com/de/products/cbuilder/starter
On 18/04/2021 2:03, Siddhesh Rane wrote:
I did some dependency analysis on the cnd modules and here is what I found.
* The current Antlr-based parsers and their generated ASTs are not directly
used by any module in cnd (such as Code Completion, Navigator etc.). Instead all
modules depend on an AST-like code model specified in the C/C++ Code Model API
(cnd.api.model).
* The Code Model API has an implementation in cnd.modelimpl which contains the
parsers and antlr grammar files. Internally, this module converts the parser's
generated AST into the Code Model API.
* There is a clean code separation between the code model impl and the api. Only
the API module is used in the rest of the cnd modules, without knowledge of the
implementation details (cnd.modelimpl is only used for test cases; cnd.modelui
has a dependency on it, but only in a package related to tracing, so I assume
it is debug-only).
* The other module containing an Antlr file, Abstract Preprocessor Tree
(cnd.apt) is only a dependency for cnd.modelimpl.
cnd.completion has it listed as a dependency but no Java file imports anything
from that module. cnd.modelui again uses it only in a trace package. So cnd.apt
is related only to the model implementation.
* There are some modules with name "clank" in them. These can be completely
removed because by default they are disabled using system flags. It seems to be an
experimental clang integration that was never used.
Based on these observations, I think we can safely get rid of all Antlr files
by targeting the Code Model Implementation module.
My suggested plan of action would be:
1. Introduce a clang based parser alongside the existing parsers. Look at class
org.netbeans.modules.cnd.modelimpl.parser.ParserProviderImpl. This currently
contains two C++ parsers: Antlr2CppParser and Antlr3CXXParser, with the antlr 2
one being used by default. The parsers have a parse method, which parses into a
custom AST, and a render method, where the custom AST is used to create Code
Model API objects. The render method is where we will need a mapping from the
clang AST to the code model api. (I am trying to find good tutorials about the
clang AST.)
2. Once we test the clang parser, we can focus on cnd.apt. This package does
have some clang translation support with stuff such as compilation db, so
eventually it can be completely removed if successfully transitioned.
3. Finally, a lot of code can be simplified. In particular, the indexing
currently happens at the same time as parsing. This needs to be moved over to
the Indexing API, as is done for Java.
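As a rough illustration of the render step in item 1, here is a small Java
sketch that walks a toy AST and emits flat, code-model-like declarations. It
is only a sketch under assumptions: AstNode and ModelDecl are hypothetical
stand-ins, not the real cnd.modelimpl or cnd.api.model types, and the parse
step is skipped by building the AST by hand. The node kinds reuse clang's
names (TranslationUnitDecl, NamespaceDecl, FunctionDecl, VarDecl) only as
plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// All types here are hypothetical; none of them is a real NetBeans or clang API.
class RenderSketch {

    // Toy stand-in for a parser-specific AST node.
    static final class AstNode {
        final String kind;
        final String name;
        final List<AstNode> children;

        AstNode(String kind, String name, AstNode... children) {
            this.kind = kind;
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    // Toy stand-in for a Code Model API object.
    static final class ModelDecl {
        final String modelKind;
        final String qualifiedName;

        ModelDecl(String modelKind, String qualifiedName) {
            this.modelKind = modelKind;
            this.qualifiedName = qualifiedName;
        }
    }

    // The "render" phase: parser AST in, model objects out.
    static List<ModelDecl> render(AstNode root) {
        List<ModelDecl> out = new ArrayList<>();
        walk(root, "", out);
        return out;
    }

    private static void walk(AstNode node, String scope, List<ModelDecl> out) {
        switch (node.kind) {
            case "NamespaceDecl": {
                // Namespaces only extend the scope used for qualified names.
                String inner = qualify(scope, node.name);
                for (AstNode child : node.children) {
                    walk(child, inner, out);
                }
                break;
            }
            case "FunctionDecl":
                out.add(new ModelDecl("FUNCTION", qualify(scope, node.name)));
                break;
            case "VarDecl":
                out.add(new ModelDecl("VARIABLE", qualify(scope, node.name)));
                break;
            default:
                // Unknown kinds (e.g. the translation unit) are just traversed.
                for (AstNode child : node.children) {
                    walk(child, scope, out);
                }
        }
    }

    private static String qualify(String scope, String name) {
        return scope.isEmpty() ? name : scope + "::" + name;
    }

    public static void main(String[] args) {
        AstNode tu = new AstNode("TranslationUnitDecl", "",
                new AstNode("NamespaceDecl", "ns", new AstNode("FunctionDecl", "f")),
                new AstNode("VarDecl", "x"));
        for (ModelDecl d : render(tu)) {
            System.out.println(d.modelKind + " " + d.qualifiedName);
        }
    }
}
```

Keeping render separate from parse in this way is what would let a clang
based parser sit alongside the antlr ones: only the mapping from AST kinds to
model objects changes, while the rest of cnd keeps seeing the same API.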
I hope that with this approach we can keep up with the latest C++ language
releases while not having to sacrifice any of the functionality in the cnd module.
Siddhesh
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
For further information about the NetBeans mailing lists, visit:
https://cwiki.apache.org/confluence/display/NETBEANS/Mailing+lists