date:20120308

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Dmitry Olshansky


On 09.03.2012 1:12, Jonathan M Davis wrote:

On Friday, March 09, 2012 00:54:48 Dmitry Olshansky wrote:

On 08.03.2012 22:46, Jonathan M Davis wrote:

On Thursday, March 08, 2012 22:03:12 Dmitry Olshansky wrote:

On 08.03.2012 11:48, Jonathan M Davis wrote:

A range is not necessarily a dynamic array, though a dynamic array is a
range. The lexer is going to need to take a range of dchar (which may or
may not be an array), and it's probably going to need to return a range
of tokens. The parser would then take a range of tokens and then output
the AST in some form or other - it probably couldn't be range, but I'm
not sure. And while the lexer would need to operate on generic ranges of
dchar, it would probably have to be special-cased for strings in a
number
of places in order to make it faster (e.g. checking the first char in a
string rather than using front when it's known that the value being
checked against is an ASCII character and will therefore fit in a single
char - front has to decode the next character, which is less efficient).


Simply put, the decisison on decoding should belong to lexer. Thus
strings should be wrapped as input range of char, wchar&  dchar
respectively.


??? The normal way to handle this is to simply special-case certain
operations. e.g.

static if(Unqual!(isElementEncodingType!R) == char)
{ ... }


Does isElementEncodingType aware of anything other then string/wstring?


No. It uses isNarrowString. So, what you'd end up doing is having the lexer
accept a generic range of dchar and then have specializations where
appropriate for narrow strings. Nothing in Phobos' string handling really
supports the idea of dealing with generic char or wchar ranges. It all
processes ranges of dchar and specializes on narrow strings where appropriate.



The concept that *only* strings need special casing is broken. I recall 
somebody already stomped on this: i.e. any range that returns char (a 
wrapper range, over say memory-mapped file) passes by all the intricate 
special casing that does decoding of std.algorithm and friends. So to 
put it simply there is no way to tap into this *decoding by default* 
infrastructure.



But is there really a use case for a generic range of char or wchar? I don't
know. In general, I really don't think that there is.


Memory mapped file wrapper is one, it's just from the top of my head. 
There could be plenty of them, one needs just to look. 'I don't

 know' is not a solid argument, sorry.

When dealing with ranges

of characters, they're essentially always either strings or strings which have
been wrapped in other ranges (generally by calling some other range-based
function such as take or filter). And those range-based functions pretty much
inevitably need to treat the strings as ranges of dchar to do what they do
(potentially with specific optimizations for strings). They aren't designed to
operate on ranges of char or wchar, and the only reason to make them do so
would be if there were a use case where you needed a range of char or wchar
which was not an array. But they're all either arrays or wrapped arrays. So,
no such use case currently exists with Phobos, and I really question the
usefulness of trying to optimize on a generic range of char or wchar -
especially when many of the optimizations for arrays involve random access
ranges, and if you have a random access range of char or wchar, I _really_
question that it would ever be anything other than an array.



And that is problem, if you fail to see why we need to stop pretending 
that all char/wchar ranges are arrays, then I failed somewhere. In the 
same vane one would argue that there is no other random access range but 
array, yet there is. For instance if we explore containers there could 
be Tries, Dequeues  and whatnot of char/wchar with their respective ranges.



So, I'd advise against trying to operate on ranges of char or wchar and just
stick to operating on ranges of dchar with optimizations for narrow strings
where appropriate.


Probably the fact that in lexer it's not 'some place for optimizations, 
it's the whole thing' was missed. That's why I'm looking for more or 
less generic yet efficient way.





Now, speaking outside of this specific problem.
Basically I would propose formalizing a kind of range that current
string/wstring is. And that is a VariableLengthEncoding range (VLE
range), a two in one - random access codeunit range and bidirectional
'codepoint' range. I've heard of attempts on this concept before, but
now with a use case at hand it could be become more important.


There has been some talk of VLE ranges but really only with regards to the
fact that strings are a case of that. Nothing has been done to generalize it.
It may be something that should be looked into, but until we've formalized
that, I don't think that something like the lexer should be built around the
idea. It would be safer to stick with what we've been doing - operating on
ranges of dchar and special-cas

Re: D in Academia

2012-03-08 Thread Brian Rogoff


On Saturday, 3 March 2012 at 19:02:23 UTC, Chuck Allison wrote:

FYI:

TDPL is a required text for CS 4450, Analysis of Programming 
Languages, at Utah Valley University starting Fall 2012. We'll 
study ML and D (and Prolog if time allows).


Slightly off topic digression: Which ML?

I ask as an OCaml partisan (and D fan!) who will obviously 
suggest that you go with that language, since it is probably the 
more widely used ML variant.


-- Brian

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Zach the Mystic


On Thursday, 8 March 2012 at 07:49:57 UTC, Jonathan M Davis wrote:
Regardless, you need to familiarize yourself with ranges if you 
want to get

the lexer and parser ready for inclusion in Phobos.


I have to admit that I don't currently feel competent to do this 
work. I'm too green. But I do think that the program is already 
very close to what is needed. I also would assume that those 
experienced with dmd's C++ incarnation will find themselves on 
rather familiar soil if they set to work on the D version, so 
that might speed things up a little if they also know about 
ranges.

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Zach the Mystic


On Thursday, 8 March 2012 at 19:36:32 UTC, Jacob Carlborg wrote:
This is great news. I was really worried that the license was 
etched in
stone. I'll need help finding out who owns the code, plus 
legal advice
if the process is more than just getting a simple confirmation 
email

from each of the original authors.


You have my blessing for the small changes I contributed with.


Great!

Re: Enhanced D syntax highlighting for Sublime Text 2

2012-03-08 Thread Byakkun

On Wed, 07 Mar 2012 20:59:36 +0200, Alex Rønne Petersen  
 wrote:



https://github.com/alexrp/st2-d

I plan to have it merged into ST2 proper if I can somehow get in touch  
with the dev(s)...





Thanks, and good luck with the merge. I find ST2 the first text editor  
that is truly

beautiful.

--
Byakkun

Re: D to Javascript converter (a hacked up dmd)

2012-03-08 Thread Marco Leise


Am 01.03.2012, 17:29 Uhr, schrieb Robert Clipsham :


Interesting idea:

version(JavaScript) asm
{
 /* Well... it's not really asm, but it's
as close as you could get with a javascript backend :p */
 document.write("Oh look, inline javascript D-:");
}

Perhaps it would be better to have a dedicated function for that though.


How about:

pragma(js)
{
 document.write("Oh look, inline javascript D-:");
}

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Jonathan M Davis

On Friday, March 09, 2012 00:54:48 Dmitry Olshansky wrote:
> On 08.03.2012 22:46, Jonathan M Davis wrote:
> > On Thursday, March 08, 2012 22:03:12 Dmitry Olshansky wrote:
> >> On 08.03.2012 11:48, Jonathan M Davis wrote:
> >>> A range is not necessarily a dynamic array, though a dynamic array is a
> >>> range. The lexer is going to need to take a range of dchar (which may or
> >>> may not be an array), and it's probably going to need to return a range
> >>> of tokens. The parser would then take a range of tokens and then output
> >>> the AST in some form or other - it probably couldn't be range, but I'm
> >>> not sure. And while the lexer would need to operate on generic ranges of
> >>> dchar, it would probably have to be special-cased for strings in a
> >>> number
> >>> of places in order to make it faster (e.g. checking the first char in a
> >>> string rather than using front when it's known that the value being
> >>> checked against is an ASCII character and will therefore fit in a single
> >>> char - front has to decode the next character, which is less efficient).
> >> 
> >> Simply put, the decisison on decoding should belong to lexer. Thus
> >> strings should be wrapped as input range of char, wchar& dchar
> >> respectively.
> > 
> > ??? The normal way to handle this is to simply special-case certain
> > operations. e.g.
> > 
> > static if(Unqual!(isElementEncodingType!R) == char)
> > { ... }
> 
> Does isElementEncodingType aware of anything other then string/wstring?

No. It uses isNarrowString. So, what you'd end up doing is having the lexer 
accept a generic range of dchar and then have specializations where 
appropriate for narrow strings. Nothing in Phobos' string handling really 
supports the idea of dealing with generic char or wchar ranges. It all 
processes ranges of dchar and specializes on narrow strings where appropriate.

But is there really a use case for a generic range of char or wchar? I don't 
know. In general, I really don't think that there is. When dealing with ranges 
of characters, they're essentially always either strings or strings which have 
been wrapped in other ranges (generally by calling some other range-based 
function such as take or filter). And those range-based functions pretty much 
inevitably need to treat the strings as ranges of dchar to do what they do 
(potentially with specific optimizations for strings). They aren't designed to 
operate on ranges of char or wchar, and the only reason to make them do so 
would be if there were a use case where you needed a range of char or wchar 
which was not an array. But they're all either arrays or wrapped arrays. So, 
no such use case currently exists with Phobos, and I really question the 
usefulness of trying to optimize on a generic range of char or wchar - 
especially when many of the optimizations for arrays involve random access 
ranges, and if you have a random access range of char or wchar, I _really_ 
question that it would ever be anything other than an array.

So, I'd advise against trying to operate on ranges of char or wchar and just 
stick to operating on ranges of dchar with optimizations for narrow strings 
where appropriate.

> Now, speaking outside of this specific problem.
> Basically I would propose formalizing a kind of range that current
> string/wstring is. And that is a VariableLengthEncoding range (VLE
> range), a two in one - random access codeunit range and bidirectional
> 'codepoint' range. I've heard of attempts on this concept before, but
> now with a use case at hand it could be become more important.

There has been some talk of VLE ranges but really only with regards to the 
fact that strings are a case of that. Nothing has been done to generalize it. 
It may be something that should be looked into, but until we've formalized 
that, I don't think that something like the lexer should be built around the 
idea. It would be safer to stick with what we've been doing - operating on 
ranges of dchar and special-casing for narrow strings where appropriate. If 
need be, it should be possible to adjust it to use VLEs later.

> The problem is, I think, that current InputRange range is insufficent as
> it requres to calculate length of first element twice: one time in front
> and extra one in popFront.

Perhaps an optional function should be added to input ranges which both 
returns and front and pops it. But VLE ranges such as narrow strings are 
probably the only ones which would really care, and we can already special 
case for strings, so it would probably only be of real value for general VLEs. 
However, since VLEs would arguably be a new range type, they could just 
implement the new function, and anything which special-cases VLEs can take 
advantage of it, while they still work just fine as input ranges. We could even 
add a free function which both pops and returns front which uses front and 
popFront for most ranges and the new function for VLEs, so you can code with 
that free function in

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Dmitry Olshansky


On 08.03.2012 22:46, Jonathan M Davis wrote:

On Thursday, March 08, 2012 22:03:12 Dmitry Olshansky wrote:

On 08.03.2012 11:48, Jonathan M Davis wrote:

A range is not necessarily a dynamic array, though a dynamic array is a
range. The lexer is going to need to take a range of dchar (which may or
may not be an array), and it's probably going to need to return a range
of tokens. The parser would then take a range of tokens and then output
the AST in some form or other - it probably couldn't be range, but I'm
not sure. And while the lexer would need to operate on generic ranges of
dchar, it would probably have to be special-cased for strings in a number
of places in order to make it faster (e.g. checking the first char in a
string rather than using front when it's known that the value being
checked against is an ASCII character and will therefore fit in a single
char - front has to decode the next character, which is less efficient).


Simply put, the decisison on decoding should belong to lexer. Thus
strings should be wrapped as input range of char, wchar&  dchar
respectively.


??? The normal way to handle this is to simply special-case certain
operations. e.g.

static if(Unqual!(isElementEncodingType!R) == char)
{ ... }


Does isElementEncodingType aware of anything other then string/wstring?


else
{ ... }

I'm not sure that wrapping char and wchar arrays in structs that treat them as
ranges of char or wchar is a good idea. At minimum, I'm not aware of anything
in Phobos currently working that way (unless you did something like that in
std.regex?).


Well it's not that I'm insisting of _wrapping_ the arrays or some such 
in general sense.  And yes, I had some hard experience in std.regex with 
UTF decoding and range design that doesn't exactly fits.


What I'm truly against is going overly generic and automatically 
stomping on your performance. That being said the general design of 
string ranges has unnessary pessimization already as it's popFront does 
a UTF length lookup, identical operation is  performed when decoding the 
first codepoint (.front).
At any rate building lexer on top of ranges in current setting means 
using abstraction that _hides_ decoding inside. That's a bad idea, it's 
loosing a fight from the start. Why? Because in this case range can't 
know if decoding is needed at this particular state of lexer or not, it 
is generality that kills this by definition.


Yeah, in general auto-magically decoding ranges are OK, as long as the 
stuff you do has cost much higher then decoding (~10 times would be 
fine) or things that you strictly can't do otherwise. Lexer doesn't fall 
into this criteria.


Everything either treats them as generic ranges of dchar or

special cases them. And when you want to be most efficient with string
processing, I would think that you'd want to treat them exactly as the arrays
of code units that they are rather than ranges - in which case treating them
as generic ranges of dchar in most places and then special casing certain
sections of code which can take advantage of the fact that they're arrays of
code units seems like the way to go.



Yeah, no arguing that. The thing is that lexer as a whole is precisely 
one of these special cases. It's good as long as it's fast and that 
requires more control then "a generic input range of dchar".


Now, speaking outside of this specific problem.
Basically I would propose formalizing a kind of range that current 
string/wstring is. And that is a VariableLengthEncoding range (VLE 
range), a two in one - random access codeunit range and bidirectional 
'codepoint' range. I've heard of attempts on this concept before, but 
now with a use case at hand it could be become more important.
The problem is, I think, that current InputRange range is insufficent as 
it requres to calculate length of first element twice: one time in front 
and extra one in popFront.




--
Dmitry Olshansky

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Jacob Carlborg


On 2012-03-08 09:20, Jonathan M Davis wrote:

On Thursday, March 08, 2012 09:11:03 Jacob Carlborg wrote:

On 2012-03-08 05:54, Jonathan M Davis wrote:

On Thursday, March 08, 2012 03:12:48 Zach the Mystic wrote:

On Thursday, 8 March 2012 at 01:43:26 UTC, Daniel Murphy wrote:

"Zach the Mystic"   wrote in
message
news:afqmbmvuvizvgfooe...@forum.dlang.org...


I'll gladly put a license on it if the leaders of the
community tell me which one to use ( Artistic, libpng, Boost ).

Zach


It will need to be the same license as the frontend
(GPL/Artistic).  It
should be at the top of each c++ source file.


It looks like the license is going to have to be GPL because it
says so strictly in dmd's readme.txt. Somehow that license scares
me, though. The "Free Software Foundation" seems like a very
Orwellian institution to me. I hope it doesn't scare users away.


If you took it from ddmd, then it's definitely going to have to be GPL.

Now, there is interest in having a D parser and lexer in Phobos. I don't
know if your version will fit the bill (e.g. it must have a range-based
API), but we need one at some point. The original idea was to more or
less directly port dmd's lexer and parser with some adjustments to the
API as necessary (primarily to make it range-based). But no one has had
the time to complete such a project yet (I originally volunteered to do
it, but I just haven't had the time).

When that project was proposed, Walter agreed to let that port be Boost
rather than GPL (since he holds the copyright and the port would be going
in Phobos, which uses boost).

The problem with what you have (even if the API and implementation were
perfect) is that it comes from ddmd, which had other contributors working
on it. So, you would have to get permission from not only Walter but all
of the relevant ddmd contributors. If you were able to _that_, and it
could get passed the review process, then what you've done could be put
into Phobos. But that requires that you take the time and effort to take
care of getting the appropriate permissions, making sure that the API and
implementation are acceptable for Phobos, and putting it through the
Phobos review process. It would be great if you could do that though.

- Jonathan M Davis


It would be nice if the frontend written in D (which ever it will be)
could be used by DMD. Then there wouldn't be any problems of being out
of sync.


Well, having it be easier to keep in sync is one of the reasons that it was
originally proposed to simply port the lexer in dmd's frontend to D. But
having it _be_ the frontend of dmd would be even better. I don't know how
interested Walter is or isn't in that, but having it be a port of the current
frontend would likely make convincing him easier than it would be if it were
one written from scratch.

- Jonathan M davis


Exactly, that is what I'm thinking as well. Maybe it then can gradually 
be modified to better support an API useful for other things than just 
the compiler. Or building an higher level API on top of it that the 
compiler may not need to use.


--
/Jacob Carlborg

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Jacob Carlborg


On 2012-03-08 08:21, Zach the Mystic wrote:

On Thursday, 8 March 2012 at 04:56:07 UTC, Jonathan M Davis wrote:

If you took it from ddmd, then it's definitely going to have to be GPL.

Now, there is interest in having a D parser and lexer in Phobos. I
don't know
if your version will fit the bill (e.g. it must have a range-based
API), but we
need one at some point. The original idea was to more or less directly
port
dmd's lexer and parser with some adjustments to the API as necessary
(primarily to make it range-based). But no one has had the time to
complete
such a project yet (I originally volunteered to do it, but I just
haven't had
the time).

When that project was proposed, Walter agreed to let that port be
Boost rather
than GPL (since he holds the copyright and the port would be going in
Phobos,
which uses boost).

The problem with what you have (even if the API and implementation were
perfect) is that it comes from ddmd, which had other contributors
working on
it. So, you would have to get permission from not only Walter but all
of the
relevant ddmd contributors. If you were able to _that_, and it could get
passed the review process, then what you've done could be put into
Phobos. But
that requires that you take the time and effort to take care of
getting the
appropriate permissions, making sure that the API and implementation are
acceptable for Phobos, and putting it through the Phobos review
process. It
would be great if you could do that though.

- Jonathan M Davis


This is great news. I was really worried that the license was etched in
stone. I'll need help finding out who owns the code, plus legal advice
if the process is more than just getting a simple confirmation email
from each of the original authors.


You have my blessing for the small changes I contributed with.


--
/Jacob Carlborg

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Jonathan M Davis

On Thursday, March 08, 2012 22:03:12 Dmitry Olshansky wrote:
> On 08.03.2012 11:48, Jonathan M Davis wrote:
> > A range is not necessarily a dynamic array, though a dynamic array is a
> > range. The lexer is going to need to take a range of dchar (which may or
> > may not be an array), and it's probably going to need to return a range
> > of tokens. The parser would then take a range of tokens and then output
> > the AST in some form or other - it probably couldn't be range, but I'm
> > not sure. And while the lexer would need to operate on generic ranges of
> > dchar, it would probably have to be special-cased for strings in a number
> > of places in order to make it faster (e.g. checking the first char in a
> > string rather than using front when it's known that the value being
> > checked against is an ASCII character and will therefore fit in a single
> > char - front has to decode the next character, which is less efficient).
> 
> Simply put, the decisison on decoding should belong to lexer. Thus
> strings should be wrapped as input range of char, wchar & dchar
> respectively.

??? The normal way to handle this is to simply special-case certain 
operations. e.g.

static if(Unqual!(isElementEncodingType!R) == char)
{ ... }
else
{ ... }

I'm not sure that wrapping char and wchar arrays in structs that treat them as 
ranges of char or wchar is a good idea. At minimum, I'm not aware of anything 
in Phobos currently working that way (unless you did something like that in 
std.regex?). Everything either treats them as generic ranges of dchar or 
special cases them. And when you want to be most efficient with string 
processing, I would think that you'd want to treat them exactly as the arrays 
of code units that they are rather than ranges - in which case treating them 
as generic ranges of dchar in most places and then special casing certain 
sections of code which can take advantage of the fact that they're arrays of 
code units seems like the way to go. The lexer is then choosing when something 
decodes, though the default is to decode, since it requires special-casing to 
avoid it.

- Jonathan M Davis

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Dmitry Olshansky


On 08.03.2012 11:48, Jonathan M Davis wrote:

On Thursday, March 08, 2012 08:21:17 Zach the Mystic wrote:

On Thursday, 8 March 2012 at 04:56:07 UTC, Jonathan M Davis wrote:

If you took it from ddmd, then it's definitely going to have to
be GPL.

Now, there is interest in having a D parser and lexer in
Phobos. I don't know
if your version will fit the bill (e.g. it must have a
range-based API), but we
need one at some point. The original idea was to more or less
directly port
dmd's lexer and parser with some adjustments to the API as
necessary
(primarily to make it range-based). But no one has had the time
to complete
such a project yet (I originally volunteered to do it, but I
just haven't had
the time).

When that project was proposed, Walter agreed to let that port
be Boost rather
than GPL (since he holds the copyright and the port would be
going in Phobos,
which uses boost).

The problem with what you have (even if the API and
implementation were
perfect) is that it comes from ddmd, which had other
contributors working on
it. So, you would have to get permission from not only Walter
but all of the
relevant ddmd contributors. If you were able to _that_, and it
could get
passed the review process, then what you've done could be put
into Phobos. But
that requires that you take the time and effort to take care of
getting the
appropriate permissions, making sure that the API and
implementation are
acceptable for Phobos, and putting it through the Phobos review
process. It
would be great if you could do that though.

- Jonathan M Davis


This is great news. I was really worried that the license was
etched in stone. I'll need help finding out who owns the code,
plus legal advice if the process is more than just getting a
simple confirmation email from each of the original authors.

I have some comments I feel are very interesting regarding the
lexer and pointers. There are no pointers in any of the code
besides the lexer, so I think that will be very satisfying to
you. Now I don't know everything about ranges, but if you simply
mean dynamic arrays, then yes, everything except the lexer uses
arrays when necessary, although there's simply a lot of code
which doesn't need them because most of the objects are really
just lists of members, many of which are not arrays.

About the lexer, one thing I realized about the Wild-West pointer
style as I was porting it is that it must be blazing fast. To my
understanding, to call p.popFront() requires two operations, ++p;
followed by --p.length; plus possibly array bounds checking, I
don't know.

++p is all that the current lexer needs. It used to only check
for EOF at each junction, but since I'm parsing little chunks of
code instead of whole files now, it checks "if ( p>= endBuf )"
at the beginning of each token scan, which gets pretty close to
not going out of bounds, since most tokens aren't very long. That
lexer is a tribute to very fast programming of an old school
which will go away if it changes. Still, I can sense a tidal wave
of RANGES coming, and I fear I'll just have to bid the little
thing goodbye! :-(


A range is not necessarily a dynamic array, though a dynamic array is a range.
The lexer is going to need to take  a range of dchar (which may or may not be
an array), and it's probably going  to need to return a range of tokens. The
parser would then take a range of  tokens and then output the AST in some form
or other - it probably couldn't be  range, but I'm not sure. And while the
lexer would need to operate on generic ranges of dchar, it would probably have
to be special-cased for strings in a number of places in order to make it
faster (e.g. checking the first char in a string rather than using front when
it's known that the value being checked against is an ASCII character and will
therefore fit in a single char - front has to decode the next character, which
is less efficient).


Simply put, the decisison on decoding should belong to lexer. Thus 
strings should be wrapped as input range of char, wchar & dchar 
respectively.




So, if you're not familiar with ranges, you probably have a fair bit of
learning ahead of you, and you're probably going to have to make a number of
changes to your lexer and parser (though the majority of it will probably be
able to stay intact). Unfortunately, a proper article and tutorial on them is
currently lacking in spite of the fact that Phobos uses them heavily.
Fortunately however, in a book that Ali Çehreli is writing on D, he has a
chapter on ranges that should help get you started:

http://ddili.org/ders/d.en/ranges.html

But I'd suggest that you play around with ranges a fair bit (especially with
strings) before trying to change what you have to use them. std.algorithm in
particular makes heavy use of ranges. And it wouldn't surprise me at all if
some portions of your lexer and parser really should be using some of Phobos'
functions but isn't currently, because it's originally a port from C++. You
should also mak

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Zach the Mystic


I hit "send" before I was done writing!

On Thursday, 8 March 2012 at 07:49:57 UTC, Jonathan M Davis wrote:
Fortunately however, in a book that Ali Çehreli is writing on 
D, he has a

chapter on ranges that should help get you started:

http://ddili.org/ders/d.en/ranges.html


Thanks. This is a really helpful guide.

Okay I'm done!

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Zach the Mystic


On Thursday, 8 March 2012 at 07:49:57 UTC, Jonathan M Davis wrote:
The lexer is going to need to take a range of dchar (which may 
or may not be an array),
And while the lexer would need to operate on generic ranges of 
dchar, it would probably have to be special-cased for strings 
in a number of places


I know what you mean. I actually cut out ddmd's conversion stuff 
because I had glanced over phobos I saw plenty of functions 
designed for this! I must have intuited what you are saying. dmd 
does all conversion to char* prior to sending the buffer to the 
lexer. I doubt there's a reason to change this procedure, only to 
put that conversion code directly into module dmd.lexer instead.


The parser would then take a range of  tokens and then output 
the AST in some form  or other - it probably couldn't be  
range, but I'm not sure.


Dmd's AST is pretty idiosyncratic.
Example: class FuncDeclaration (function declaration ) has a 
bunch of named members:

{
Identifier ident; // the function's name
Parameter[] parameters; // its parameters
Statement frequire; // the in{} contract, if present
Statement fbody; // function body
etc.

Each one has its own name. I actually was working on how to turn 
it into a more iterable format, since if you want to edit the AST 
directly you're going to need to cursor down or up to the element 
you want. It's actually doable, but it's not a natural range-ish 
format. That's where I'm confused about the licensing issues, 
since I'm not sure if the particular object structure which gets 
parsed is also going to be in phobos or if it must remain GPL, 
which I'm not sure I want to continue using.



So, if you're not familiar with ranges, you probably have a 
fair bit of
learning ahead of you, and you're probably going to have to 
make a number of
changes to your lexer and parser (though the majority of it 
will probably be
able to stay intact). Unfortunately, a proper article and 
tutorial on them is
currently lacking in spite of the fact that Phobos uses them 
heavily.
Fortunately however, in a book that Ali Çehreli is writing on 
D, he has a

chapter on ranges that should help get you started:

http://ddili.org/ders/d.en/ranges.html

But I'd suggest that you play around with ranges a fair bit 
(especially with
strings) before trying to change what you have to use them. 
std.algorithm in
particular makes heavy use of ranges. And it wouldn't surprise 
me at all if
some portions of your lexer and parser really should be using 
some of Phobos'
functions but isn't currently, because it's originally a port 
from C++. You
should also make sure that you understand the basics of Unicode 
fairly well -
especially with how they pertain to char, wchar, and dchar - 
since that will
affect your ability to correctly translate code to use ranges 
as well as

properly optimize them.

It would probably help if other D developers who are more 
familiar with ranges
took a look at what you have and maybe even helped you start 
adjusting your
code, but I don't know how many will both have the time and be 
interested. If
I have time, I'll probably start poking at it, but I don't know 
that I'll have

time any time soon, much as I'd like to.

Regardless, you need to familiarize yourself with ranges if you 
want to get
the lexer and parser ready for inclusion in Phobos. And you 
really should
familiarize yourself with them anyway, since they're heavily 
used in D code in
general. Not being able to use ranges in D would be like not 
being able to use
iterators in C++. You can program in it, but you'd be fairly 
crippled -

particularly when dealing with the standard library.

- Jonathan M Davis

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Zach the Mystic


On Thursday, 8 March 2012 at 04:53:20 UTC, Zach the Mystic wrote:
Anyway, the first thing I need is a gui, and a code generator 
capable of coloring its output appropriately, so I'm working on 
that, but it's not (even close to) ready for show yet!


By "Code Generator" I actually mean pretty-printer, a distinction 
I was recently made aware of.

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Zach the Mystic


On Thursday, 8 March 2012 at 09:05:05 UTC, David Nadlinger wrote:
On Wednesday, 7 March 2012 at 20:02:57 UTC, Zach the Mystic 
wrote:

https://github.com/zachthemystic/ddmd-clean/


By the way, in compilers, »code generation« is commonly used 
to refer to the generation of machine code; so using the term 
to refer to .di file generation/pretty-printing could be 
misleading for some (it was for me, at least^^).


David


I'm sorry. I see your point. I guess I'll call it 
pretty-printing, although it also occurred to me to call it 
"disparsing", which then humorously led to the notion 
"dyslexing". Wait, did I spell that right? :-)

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread James Miller

On 8 March 2012 22:05, David Nadlinger  wrote:
> On Wednesday, 7 March 2012 at 20:02:57 UTC, Zach the Mystic wrote:
>>
>> https://github.com/zachthemystic/ddmd-clean/
>

I would like to see the parser output an AST for use in other
situations. It would be nice to have a tool that can analyse the AST
and do thing with it (like autocompletion). Clang has done some pretty
cool things in this respect, to the point that it can practically do
code completion itself.

--
James Miller

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread David Nadlinger


On Wednesday, 7 March 2012 at 20:02:57 UTC, Zach the Mystic wrote:

https://github.com/zachthemystic/ddmd-clean/


By the way, in compilers, »code generation« is commonly used to 
refer to the generation of machine code; so using the term to 
refer to .di file generation/pretty-printing could be misleading 
for some (it was for me, at least^^).


David

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Jonathan M Davis

On Thursday, March 08, 2012 09:11:03 Jacob Carlborg wrote:
> On 2012-03-08 05:54, Jonathan M Davis wrote:
> > On Thursday, March 08, 2012 03:12:48 Zach the Mystic wrote:
> >> On Thursday, 8 March 2012 at 01:43:26 UTC, Daniel Murphy wrote:
> >>> "Zach the Mystic"  wrote in
> >>> message
> >>> news:afqmbmvuvizvgfooe...@forum.dlang.org...
> >>> 
>  I'll gladly put a license on it if the leaders of the
>  community tell me which one to use ( Artistic, libpng, Boost ).
>  
>  Zach
> >>> 
> >>> It will need to be the same license as the frontend
> >>> (GPL/Artistic).  It
> >>> should be at the top of each c++ source file.
> >> 
> >> It looks like the license is going to have to be GPL because it
> >> says so strictly in dmd's readme.txt. Somehow that license scares
> >> me, though. The "Free Software Foundation" seems like a very
> >> Orwellian institution to me. I hope it doesn't scare users away.
> > 
> > If you took it from ddmd, then it's definitely going to have to be GPL.
> > 
> > Now, there is interest in having a D parser and lexer in Phobos. I don't
> > know if your version will fit the bill (e.g. it must have a range-based
> > API), but we need one at some point. The original idea was to more or
> > less directly port dmd's lexer and parser with some adjustments to the
> > API as necessary (primarily to make it range-based). But no one has had
> > the time to complete such a project yet (I originally volunteered to do
> > it, but I just haven't had the time).
> > 
> > When that project was proposed, Walter agreed to let that port be Boost
> > rather than GPL (since he holds the copyright and the port would be going
> > in Phobos, which uses boost).
> > 
> > The problem with what you have (even if the API and implementation were
> > perfect) is that it comes from ddmd, which had other contributors working
> > on it. So, you would have to get permission from not only Walter but all
> > of the relevant ddmd contributors. If you were able to _that_, and it
> > could get passed the review process, then what you've done could be put
> > into Phobos. But that requires that you take the time and effort to take
> > care of getting the appropriate permissions, making sure that the API and
> > implementation are acceptable for Phobos, and putting it through the
> > Phobos review process. It would be great if you could do that though.
> > 
> > - Jonathan M Davis
> 
> It would be nice if the frontend written in D (which ever it will be)
> could be used by DMD. Then there wouldn't be any problems of being out
> of sync.

Well, having it be easier to keep in sync is one of the reasons that it was 
originally proposed to simply port the lexer in dmd's frontend to D. But 
having it _be_ the frontend of dmd would be even better. I don't know how 
interested Walter is or isn't in that, but having it be a port of the current 
frontend would likely make convincing him easier than it would be if it were 
one written from scratch.

- Jonathan M davis

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Jacob Carlborg


On 2012-03-08 05:54, Jonathan M Davis wrote:

On Thursday, March 08, 2012 03:12:48 Zach the Mystic wrote:

On Thursday, 8 March 2012 at 01:43:26 UTC, Daniel Murphy wrote:

"Zach the Mystic"  wrote in
message
news:afqmbmvuvizvgfooe...@forum.dlang.org...


I'll gladly put a license on it if the leaders of the
community tell me which one to use ( Artistic, libpng, Boost ).

Zach


It will need to be the same license as the frontend
(GPL/Artistic).  It
should be at the top of each c++ source file.


It looks like the license is going to have to be GPL because it
says so strictly in dmd's readme.txt. Somehow that license scares
me, though. The "Free Software Foundation" seems like a very
Orwellian institution to me. I hope it doesn't scare users away.


If you took it from ddmd, then it's definitely going to have to be GPL.

Now, there is interest in having a D parser and lexer in Phobos. I don't know
if your version will fit the bill (e.g. it must have a range-based API), but we
need one at some point. The original idea was to more or less directly port
dmd's lexer and parser with some adjustments to the API as necessary
(primarily to make it range-based). But no one has had the time to complete
such a project yet (I originally volunteered to do it, but I just haven't had
the time).

When that project was proposed, Walter agreed to let that port be Boost rather
than GPL (since he holds the copyright and the port would be going in Phobos,
which uses boost).

The problem with what you have (even if the API and implementation were
perfect) is that it comes from ddmd, which had other contributors working on
it. So, you would have to get permission from not only Walter but all of the
relevant ddmd contributors. If you were able to _that_, and it could get
passed the review process, then what you've done could be put into Phobos. But
that requires that you take the time and effort to take care of getting the
appropriate permissions, making sure that the API and implementation are
acceptable for Phobos, and putting it through the Phobos review process. It
would be great if you could do that though.

- Jonathan M Davis


It would be nice if the frontend written in D (which ever it will be) 
could be used by DMD. Then there wouldn't be any problems of being out 
of sync.


--
/Jacob Carlborg

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

2012-03-08 Thread Jacob Carlborg


On 2012-03-07 21:02, Zach the Mystic wrote:

Check it out:
https://github.com/zachthemystic/ddmd-clean/

This program is an adaptation of the work done by the ddmd team:
http://www.dsource.org/projects/ddmd

I described most of it in the README. I hope it runs smoothly for you. I
only ran it on MAC OSX, and I don't know much about github or about how
to get things running smoothly for others.

Don't expect miracles. The parser is NOT up to date, e.g. => with the
lastest lambda syntax.

I'll gladly put a license on it if the leaders of the community tell me
which one to use ( Artistic, libpng, Boost ).

Onward and upward to IDE functionality!

Zach


This looks awesome.

--
/Jacob Carlborg

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D in Academia

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: Enhanced D syntax highlighting for Sublime Text 2

Re: D to Javascript converter (a hacked up dmd)

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Re: D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

21 matches

Site Navigation

Mail list logo

Footer information