Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Zach the Mystic Thu, 08 Mar 2012 08:25:09 -0800

On Thursday, 8 March 2012 at 07:49:57 UTC, Jonathan M Davis wrote:

The lexer is going to need to take a range of dchar (which may or may not be an array), And while the lexer would need to operate on generic ranges of dchar, it would probably have to be special-cased for strings in a number of places

I know what you mean. I actually cut out ddmd's conversion stuff because I had glanced over Phobos and saw plenty of functions designed for this! I must have intuited what you are saying. dmd does all conversion to char* prior to sending the buffer to the lexer. I doubt there's a reason to change this procedure, only to put that conversion code directly into module dmd.lexer instead.
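To make the "generic range of dchar, special-cased for strings" idea concrete, here is a minimal sketch. It is not taken from dmd or ddmd: countWords is a hypothetical stand-in for a lexer entry point, and it assumes a Phobos recent enough to have std.range.primitives.

```d
import std.range.primitives : ElementType, empty, front, popFront, isInputRange;
import std.traits : isSomeString, Unqual;
import std.uni : isWhite;

// Hypothetical helper (not part of dmd): counts whitespace-separated
// words in any input range whose elements are dchar.
size_t countWords(R)(R input)
    if (isInputRange!R && is(Unqual!(ElementType!R) == dchar))
{
    size_t words = 0;
    bool inWord = false;

    // Shared state machine, fed one code point at a time.
    void feed(dchar c)
    {
        immutable bool white = isWhite(c);
        if (!white && !inWord)
            ++words;
        inWord = !white;
    }

    static if (isSomeString!R)
    {
        // String special case: let foreach decode the array directly.
        foreach (dchar c; input)
            feed(c);
    }
    else
    {
        // Generic path: only the input range primitives are used.
        for (; !input.empty; input.popFront())
            feed(input.front);
    }
    return words;
}

void main()
{
    import std.algorithm.iteration : filter;

    // A string takes the special-cased branch.
    assert(countWords("int x = 1;") == 4);
    // A lazy filter over a string is a range of dchar but not an array.
    assert(countWords("a  b\tc".filter!(c => c != 'b')) == 2);
}
```

A real lexer would presumably do more in the string branch (for example, scanning code units without decoding), but the template constraint and the static if show where such a split would live.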

The parser would then take a range of tokens and then output the AST in some form or other - it probably couldn't be a range, but I'm not sure.
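As a rough illustration of what "a range of tokens" could look like, the sketch below wraps a toy lexer in the three input range primitives. Everything in it (TokKind, Token, TokenRange) is a hypothetical name invented for this example, and it only handles ASCII identifiers, integers and single-character symbols, which is far less than dmd's lexer does.

```d
import std.ascii : isAlpha, isDigit, isWhite;
import std.range.primitives : isInputRange;

// Hypothetical token types for illustration only.
enum TokKind { identifier, number, symbol }

struct Token
{
    TokKind kind;
    string text;
}

// A lazy input range of tokens over ASCII source text.
struct TokenRange
{
    private string src;     // text not yet lexed
    private Token current;  // token returned by front
    private bool done;      // true once the input is exhausted

    this(string src)
    {
        this.src = src;
        popFront(); // prime the first token
    }

    @property bool empty() const { return done; }
    @property Token front() { return current; }

    void popFront()
    {
        // Skip leading whitespace.
        while (src.length && isWhite(src[0]))
            src = src[1 .. $];
        if (!src.length)
        {
            done = true;
            return;
        }

        size_t n = 1;
        TokKind kind = TokKind.symbol;
        if (isAlpha(src[0]))
        {
            kind = TokKind.identifier;
            while (n < src.length && (isAlpha(src[n]) || isDigit(src[n])))
                ++n;
        }
        else if (isDigit(src[0]))
        {
            kind = TokKind.number;
            while (n < src.length && isDigit(src[n]))
                ++n;
        }
        current = Token(kind, src[0 .. n]);
        src = src[n .. $];
    }
}

static assert(isInputRange!TokenRange);

void main()
{
    import std.algorithm.iteration : map;
    import std.array : array;

    // Because TokenRange is an input range, std.algorithm works on it directly.
    auto texts = TokenRange("x = 42;").map!(t => t.text).array;
    assert(texts == ["x", "=", "42", ";"]);
}
```

A parser written against isInputRange rather than against this concrete struct could then consume tokens from any source that provides empty, front and popFront.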


Dmd's AST is pretty idiosyncratic.

Example: class FuncDeclaration (function declaration) has a bunch of named members:

{
Identifier ident; // the function's name
Parameter[] parameters; // its parameters
Statement frequire; // the in{} contract, if present
Statement fbody; // function body
etc.

Each one has its own name. I actually was working on how to turn it into a more iterable format, since if you want to edit the AST directly you're going to need to cursor down or up to the element you want. It's actually doable, but it's not a natural range-ish format. That's where I'm confused about the licensing issues, since I'm not sure if the particular object structure which gets parsed is also going to be in Phobos or if it must remain GPL, which I'm not sure I want to continue using.
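One way the "more iterable format" might look is sketched below: each node keeps its named members but also exposes them through a uniform children() method, so a generic walk can move down the tree. The Node base class, the children() method and the labels() helper are assumptions made for this example; dmd's real classes are considerably larger and are not shaped like this.

```d
// Simplified stand-ins, not dmd's actual declarations.
class Node
{
    string label;

    this(string label) { this.label = label; }

    // Children in source order; leaf nodes have none.
    Node[] children() { return null; }
}

class FuncDeclaration : Node
{
    Node[] parameters; // its parameters
    Node frequire;     // the in{} contract, if present (may be null)
    Node fbody;        // function body

    this(string name) { super(name); }

    // Expose the named members as one ordered list.
    override Node[] children()
    {
        Node[] result = parameters.dup;
        if (frequire !is null)
            result ~= frequire;
        if (fbody !is null)
            result ~= fbody;
        return result;
    }
}

// Depth-first walk that only relies on children().
string[] labels(Node root)
{
    string[] result = [root.label];
    foreach (child; root.children())
        result ~= labels(child);
    return result;
}

void main()
{
    auto f = new FuncDeclaration("foo");
    f.parameters = [new Node("param a"), new Node("param b")];
    f.fbody = new Node("body");
    assert(labels(f) == ["foo", "param a", "param b", "body"]);
}
```

The named fields stay available for code that wants them, while tools that need to cursor through the tree can ignore the concrete node type.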

So, if you're not familiar with ranges, you probably have a fair bit of learning ahead of you, and you're probably going to have to make a number of changes to your lexer and parser (though the majority of it will probably be able to stay intact). Unfortunately, a proper article and tutorial on them is currently lacking in spite of the fact that Phobos uses them heavily. Fortunately however, in a book that Ali Çehreli is writing on D, he has a chapter on ranges that should help get you started:

http://ddili.org/ders/d.en/ranges.html

But I'd suggest that you play around with ranges a fair bit (especially with strings) before trying to change what you have to use them. std.algorithm in particular makes heavy use of ranges. And it wouldn't surprise me at all if some portions of your lexer and parser really should be using some of Phobos' functions but aren't currently, because it's originally a port from C++. You should also make sure that you understand the basics of Unicode fairly well - especially with how they pertain to char, wchar, and dchar - since that will affect your ability to correctly translate code to use ranges as well as properly optimize them.
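
The difference between char, wchar and dchar shows up as soon as a string contains a non-ASCII character. The short sketch below assumes the source file is saved as UTF-8 and that Phobos auto-decodes narrow strings when they are used as ranges; codePointCount is a hypothetical helper written for this example.

```d
import std.range : walkLength;
import std.range.primitives : front;

// Hypothetical helper: counts code points rather than UTF-8 code units.
size_t codePointCount(const(char)[] utf8)
{
    return utf8.walkLength;
}

void main()
{
    string  s = "Çehreli";  // UTF-8, elements are char
    wstring w = "Çehreli"w; // UTF-16, elements are wchar
    dstring d = "Çehreli"d; // UTF-32, elements are dchar

    // .length counts code units, so the UTF-8 string is one longer:
    // 'Ç' (U+00C7) needs two UTF-8 code units.
    assert(s.length == 8);
    assert(w.length == 7);
    assert(d.length == 7);

    // Walking the string as a range counts decoded code points.
    assert(codePointCount(s) == 7);

    // Indexing sees a raw code unit, while front decodes a full dchar.
    static assert(is(typeof(s.front) == dchar));
    assert(s[0] == 0xC3);
    assert(s.front == 'Ç');
}
```

This mismatch between array length and range length is one reason a lexer that indexes into a char array cannot be converted to range primitives purely mechanically.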

It would probably help if other D developers who are more familiar with ranges took a look at what you have and maybe even helped you start adjusting your code, but I don't know how many will both have the time and be interested. If I have time, I'll probably start poking at it, but I don't know that I'll have time any time soon, much as I'd like to.

Regardless, you need to familiarize yourself with ranges if you want to get the lexer and parser ready for inclusion in Phobos. And you really should familiarize yourself with them anyway, since they're heavily used in D code in general. Not being able to use ranges in D would be like not being able to use iterators in C++. You can program in it, but you'd be fairly crippled - particularly when dealing with the standard library.

- Jonathan M Davis
