On Friday, 13 April 2012 at 09:57:49 UTC, Ary Manzana wrote:
Having a D compiler available as a library will (at least) give
these benefits:
1. Can be used by an IDE: D is statically typed and so an IDE
can benefit a lot from this. The features Descent had, as far
as I remember, were:
1.1. Outline
1.2. Autocompletion
1.3. Type Hierarchy
1.4. Syntax and semantic errors, showing not only the line
number but also column numbers if it makes sense
1.5. Automatic import inclusion (say, typing writefln and
getting a list of modules that provide that symbol)
1.6. Compile-time view: replace auto with the inferred
type, insert mixins into scope, rewrite operator overloads and
other lowerings (but I'm not sure this point is really useful)
1.7. Determine, given a set of versions and flags, which
branches of static ifs are used/unused
1.8. Open declaration
1.9. Show implementations (of an interface, of an interface's
method, of abstract methods, or of method overrides).
1.10. Propose to override a method (you type some letters
and then hit some key combination and get a list of methods to
override)
1.11. Get the code of a template when instantiated.
2. Can be used to build better doc generators: one that shows
known subclasses or interface implementation, shows inherited
methods, type hierarchy.
3. Can be used for lints and other such tools.
As you can see, a simple lexer/parser built into an IDE, doc
generator or lint will just give basic features but will never
achieve something exceptionally good if it lacks the full
semantic knowledge of the code.
I'll write a list of things I'd like this compiler-as-library
to have, but please help me make it bigger :-)
* Don't use global variables (DMD is designed to be run once,
so when used as a library it can be used, well, only once)
* Provide a lexer which gives line numbers and column numbers
(beginning, end)
* Provide a parser with the same features
* The semantic phase should not discard any information found
while parsing. For example when DMD resolves a type it
recursively resolves aliasing and keeps the last one. An
example:
alias int foo;
alias foo* bar;
bar something() { ... }
It would be nice if "bar", after semantic analysis is done,
carries the information that bar is "foo*" and that "foo" is
"int". Also that something's return type is "bar", not "int*".
* Provide errors and warnings that have line numbers as well
as column numbers.
* Allow parsing only the top-level definitions of a module. By
this I mean skipping function bodies. Descent, at least, first
built the outline of the whole project by doing this. This
mode should also allow specifying a location as a target, and
if that location falls inside a function body then its
contents are returned (useful when editing a file, so you can
get the outline as well as semantic info of the function
currently being edited, which will never affect semantics in
other parts of the module). This will dramatically speed up the
editor.
* Don't stop parsing on errors (I think DMD already does this).
* Provide a visitor class. If possible, use visitors to
implement semantic analysis. The visitor will make it super
easy to implement lints and to generate documentation.
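Two of the wishes above, keeping alias information after semantic analysis and exposing a visitor, can be sketched together in a few lines of D. This is only an illustration under assumptions: none of these names (Type, BuiltinType, PointerType, AliasType, TypeVisitor, AliasReporter) come from DMD or Descent, and a real front end would carry far more state, such as source locations.

```d
import std.stdio : writeln;

/// Hypothetical visitor interface over the type nodes below.
interface TypeVisitor
{
    void visit(BuiltinType t);
    void visit(PointerType t);
    void visit(AliasType t);
}

/// Hypothetical base class for type nodes.
abstract class Type
{
    /// The type as the user wrote it, aliases kept.
    abstract string spelling();
    /// The type with every alias stripped away.
    abstract string resolved();
    abstract void accept(TypeVisitor v);
}

final class BuiltinType : Type
{
    string name;
    this(string name) { this.name = name; }
    override string spelling() { return name; }
    override string resolved() { return name; }
    override void accept(TypeVisitor v) { v.visit(this); }
}

final class PointerType : Type
{
    Type target;
    this(Type target) { this.target = target; }
    override string spelling() { return target.spelling() ~ "*"; }
    override string resolved() { return target.resolved() ~ "*"; }
    override void accept(TypeVisitor v) { v.visit(this); }
}

/// An alias keeps both its own name and the type it stands for,
/// so resolving it does not throw the intermediate steps away.
final class AliasType : Type
{
    string name;
    Type aliased;
    this(string name, Type aliased) { this.name = name; this.aliased = aliased; }
    override string spelling() { return name; }
    override string resolved() { return aliased.resolved(); }
    override void accept(TypeVisitor v) { v.visit(this); }
}

/// Example visitor: records every alias step it meets.
final class AliasReporter : TypeVisitor
{
    string[] steps;
    void visit(BuiltinType t) {}
    void visit(PointerType t) { t.target.accept(this); }
    void visit(AliasType t)
    {
        steps ~= t.name ~ " = " ~ t.aliased.spelling();
        t.aliased.accept(this);
    }
}

void main()
{
    // alias int foo; alias foo* bar;
    auto foo = new AliasType("foo", new BuiltinType("int"));
    auto bar = new AliasType("bar", new PointerType(foo));

    auto reporter = new AliasReporter;
    bar.accept(reporter);

    writeln(bar.spelling()); // bar
    writeln(bar.resolved()); // int*
    writeln(reporter.steps); // ["bar = foo*", "foo = int"]
}
```

An IDE could show spelling() in hovers and outlines while code generation uses resolved(), and a lint or doc generator would be just another TypeVisitor.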
By the way, I also started a project called The D Compiler Tools
(DCT), https://github.com/roman-d-boiko/DCT, about a month ago. It
is provided under the Boost license, and its goal is to enable
building third-party tools with functionality that includes the
features described above. I'm trying to build an LLVM-based codegen
and also to reuse the frontend in a separate project with basic IDE
functionality for D.
I have never implemented compilers before, and probably should
have called my project SDC if only that name had not been taken
before ;) Goals are very similar to those of SDC (especially now,
after its re-licensing). But I can't commit to ever finishing the
project, because my free time is very limited :(.
There is SIGNIFICANTLY less functionality implemented at this
moment than in SDC. Currently, only primitive lexing is in place
(I follow the KISS principle where possible) and parsing of auto
declarations (auto i = 3 * (2 + 8), etc.) with stubs for most
other cases. (Please note that the project's README file is outdated.)
The parser is top-down recursive descent, and it follows the
specification very closely, apart from some differences needed to
simplify the implementation (such as using loops to handle
left-recursive rules in the specification).
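To illustrate the loop technique, here is a minimal sketch. It is not DCT's code: the names Parser, parseAdd, parseMul, parsePrimary and evaluate are hypothetical, only +, - and * on non-negative integers are handled, and the value is computed directly instead of building an AST, purely to keep it short. A left-recursive rule of the shape "AddExpression: MulExpression | AddExpression + MulExpression" would recurse forever if translated literally into a function, so the function parses one MulExpression and then loops while it sees another operator, which also keeps the operators left-associative.

```d
import std.ascii : isDigit, isWhite;
import std.stdio : writeln;

/// Hypothetical recursive-descent evaluator, for illustration only.
struct Parser
{
    string src;
    size_t pos;

    /// Skip whitespace and return the next character, or '\0' at the end.
    char peek()
    {
        while (pos < src.length && isWhite(src[pos])) ++pos;
        return pos < src.length ? src[pos] : '\0';
    }

    // Left-recursive rule:  Add: Mul | Add ('+' | '-') Mul
    // Written as a loop:    Add: Mul (('+' | '-') Mul)*
    long parseAdd()
    {
        long value = parseMul();
        for (char op = peek(); op == '+' || op == '-'; op = peek())
        {
            ++pos; // consume the operator
            long rhs = parseMul();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // Left-recursive rule:  Mul: Primary | Mul '*' Primary
    // Written as a loop:    Mul: Primary ('*' Primary)*
    long parseMul()
    {
        long value = parsePrimary();
        while (peek() == '*')
        {
            ++pos; // consume '*'
            value *= parsePrimary();
        }
        return value;
    }

    // Primary: integer literal | '(' Add ')'
    long parsePrimary()
    {
        if (peek() == '(')
        {
            ++pos; // consume '('
            long inner = parseAdd();
            if (peek() != ')') throw new Exception("expected ')'");
            ++pos; // consume ')'
            return inner;
        }
        if (!isDigit(peek())) throw new Exception("expected a number");
        long number = 0;
        while (pos < src.length && isDigit(src[pos]))
            number = number * 10 + (src[pos++] - '0');
        return number;
    }
}

/// Parse and evaluate a whole expression, rejecting trailing input.
long evaluate(string text)
{
    auto p = Parser(text);
    long result = p.parseAdd();
    if (p.peek() != '\0') throw new Exception("unexpected trailing input");
    return result;
}

void main()
{
    writeln(evaluate("3 * (2 + 8)")); // 30
    writeln(evaluate("10 - 4 - 3"));  // 3, i.e. (10 - 4) - 3
}
```

The second call shows why the loop matters: a right-recursive rewrite of the rule would group the subtraction as 10 - (4 - 3) and give 9 instead.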
Anyone interested in discussing DCT or participating in
development would be welcome!