On 2012-08-01 22:20, Jonathan M Davis wrote:
If you want really good performance out of a range-based solution operating on
ranges of dchar, then you need to special case for the built-in string types
all over the place, and if you have to wrap them in other range types
(generally because of
On 2012-08-02 00:23, David wrote:
I think the best way here is to define a BufferedRange that takes any
other range and supplies a buffer for it (with the appropriate
primitives) in a native array.
Andrei
Don't you think, this range stuff is overdone? Define some fancy Range
stuff, if an
On Thursday, August 02, 2012 08:18:39 Jacob Carlborg wrote:
On 2012-08-01 22:20, Jonathan M Davis wrote:
If you want really good performance out of a range-based solution
operating on ranges of dchar, then you need to special case for the
built-in string types all over the place, and if you
On 2012-08-02 08:26, Jonathan M Davis wrote:
It's really not all that hard to special case for strings, especially when
you're operating primarily on code units. And I think that the lexer should be
flexible enough to be usable with ranges other than strings. We're trying to
make most stuff in
On Thursday, August 02, 2012 08:51:26 Jacob Carlborg wrote:
On 2012-08-02 08:26, Jonathan M Davis wrote:
It's really not all that hard to special case for strings, especially when
you're operating primarily on code units. And I think that the lexer
should be flexible enough to be usable
On Wed, 2012-08-01 at 07:56 -0400, Andrei Alexandrescu wrote:
[….]
Well this doesn't do a lot in the way of substantiating. I do want to be
illuminated. I want to get DVCS! And my understanding is that we need to
branch whenever we plan a new release, and cherry-pick bugfixes from the
On 2012-08-02 09:43, Jonathan M Davis wrote:
For instance, I have this function which I use to generate a mixin any time
that I want to get the first code unit:
string declareFirst(R)()
if(isForwardRange!R && is(Unqual!(ElementType!R) == dchar))
{
static if(isNarrowString!R)
On 8/2/12 3:43 AM, Jonathan M Davis wrote:
A range-based function operating on strings without special-casing them often
_will_ harm performance. But if you special-case them for strings, then you
can avoid that performance penalty - especially if you can avoid having to
decode any characters.
On 8/2/12 8:47 AM, Andrei Alexandrescu wrote:
On 8/2/12 5:00 AM, Russel Winder wrote:
On Wed, 2012-08-01 at 07:56 -0400, Andrei Alexandrescu wrote:
[….]
Well this doesn't do a lot in the way of substantiating. I do want to be
illuminated. I want to get DVCS! And my understanding is that we
On 8/2/12 5:00 AM, Russel Winder wrote:
On Wed, 2012-08-01 at 07:56 -0400, Andrei Alexandrescu wrote:
[….]
Well this doesn't do a lot in the way of substantiating. I do want to be
illuminated. I want to get DVCS! And my understanding is that we need to
branch whenever we plan a new release, and
Another big pile of bug fixes. More contributors than ever!
http://www.digitalmars.com/d/1.0/changelog.html
http://ftp.digitalmars.com/dmd.1.075.zip
http://www.digitalmars.com/d/2.0/changelog.html
https://github.com/downloads/D-Programming-Language/dmd/dmd.2.060.zip
Walter Bright:
Another big pile of bug fixes. More contributors than ever!
And there is the first step of this change too:
http://d.puremagic.com/issues/show_bug.cgi?id=6652
Bye,
bearophile
On 02-08-2012 21:40, Peter Alexander wrote:
Nice update, but broke Derelict2 :-(
Regression: delegates with default arguments are broken (worked in 2.059)
void foo(void delegate(int x = 0) fun)
{
fun(); // Error: expected 1 function arguments, not 0
}
I think it was decided that this
On Thursday, 2 August 2012 at 19:19:04 UTC, Walter Bright wrote:
Another big pile of bug fixes. More contributors than ever!
http://www.digitalmars.com/d/1.0/changelog.html
http://ftp.digitalmars.com/dmd.1.075.zip
http://www.digitalmars.com/d/2.0/changelog.html
On 8/2/12, Walter Bright newshou...@digitalmars.com wrote:
Known issue, it's an inevitable result (it never worked right anyway):
http://d.puremagic.com/issues/show_bug.cgi?id=8454
P.S. You might want to monitor the beta releases.
I've posted about that exact Derelict case in Issue 3866
On 02-08-2012 21:18, Walter Bright wrote:
Another big pile of bug fixes. More contributors than ever!
http://www.digitalmars.com/d/1.0/changelog.html
http://ftp.digitalmars.com/dmd.1.075.zip
http://www.digitalmars.com/d/2.0/changelog.html
Memory usage of my program when compiled by dmd2.057, 2.058,
2.059 and 2.060:
http://postimage.org/image/hqn6l4l8p/
It's a great improvement. Thanks for the new release.
On 02-08-2012 21:48, Alex Rønne Petersen wrote:
On 02-08-2012 21:18, Walter Bright wrote:
Another big pile of bug fixes. More contributors than ever!
http://www.digitalmars.com/d/1.0/changelog.html
http://ftp.digitalmars.com/dmd.1.075.zip
http://www.digitalmars.com/d/2.0/changelog.html
On Thu, Aug 2, 2012 at 10:18 PM, dnewbie r...@myopera.com wrote:
Memory usage of my program when compiled by dmd2.057, 2.058, 2.059 and 2.060:
http://postimage.org/image/hqn6l4l8p/
It's a great improvement. Thanks for the new release.
Wow.
Ok, got another regression. Quite a scary bug.
http://d.puremagic.com/issues/show_bug.cgi?id=8497
On 8/2/12, Philippe Sigaud philippe.sig...@gmail.com wrote:
On Thu, Aug 2, 2012 at 10:18 PM, dnewbie r...@myopera.com wrote:
Memory usage of my program when compiled by dmd2.057, 2.058, 2.059 and
2.060:
http://postimage.org/image/hqn6l4l8p/
It's a great improvement. Thanks for the new release.
On Thursday, 2 August 2012 at 20:38:11 UTC, Andrej Mitrovic wrote:
On 8/2/12, Philippe Sigaud philippe.sig...@gmail.com wrote:
On Thu, Aug 2, 2012 at 10:18 PM, dnewbie r...@myopera.com wrote:
Memory usage of my program when compiled by dmd2.057, 2.058, 2.059 and
2.060:
On Thu, 02 Aug 2012 12:18:37 -0700,
Walter Bright newshou...@digitalmars.com wrote:
Another big pile of bug fixes. More contributors than ever!
http://www.digitalmars.com/d/1.0/changelog.html
http://ftp.digitalmars.com/dmd.1.075.zip
http://www.digitalmars.com/d/2.0/changelog.html
On 08/02/2012 09:28 PM, bearophile wrote:
Walter Bright:
Another big pile of bug fixes. More contributors than ever!
And there is the first step of this change too:
http://d.puremagic.com/issues/show_bug.cgi?id=6652
Bye,
bearophile
Which is the wrong thing to do. 'ref' means that the range
On 8/2/12 4:36 PM, Jacob Carlborg wrote:
On 2012-08-02 21:18, Walter Bright wrote:
Another big pile of bug fixes. More contributors than ever!
http://www.digitalmars.com/d/1.0/changelog.html
http://ftp.digitalmars.com/dmd.1.075.zip
http://www.digitalmars.com/d/2.0/changelog.html
Walter Bright wrote:
Another big pile of bug fixes. More contributors than ever!
http://www.digitalmars.com/d/1.0/changelog.html
http://ftp.digitalmars.com/dmd.1.075.zip
http://www.digitalmars.com/d/2.0/changelog.html
https://github.com/downloads/D-Programming-Language/dmd/dmd.2.060.zip
It's
On 02-08-2012 23:25, Walter Bright wrote:
On 8/2/2012 1:08 PM, Alex Rønne Petersen wrote:
Unfortunately ran into a couple of regressions (though nothing major).
Please join the beta program!
I usually do, but didn't really get the time to try it out this release.
--
Alex Rønne Petersen
On 8/2/2012 1:08 PM, Alex Rønne Petersen wrote:
Unfortunately ran into a couple of regressions (though nothing major).
Please join the beta program!
On 8/2/2012 1:46 PM, Marco Leise wrote:
By the way, it would be great if the bash completion script was also available
in the .zip distribution.
Please submit a pull request.
On Thu, 02 Aug 2012 14:54:01 -0700,
Walter Bright newshou...@digitalmars.com wrote:
On 8/2/2012 1:46 PM, Marco Leise wrote:
By the way, it would be great if the bash completion script was also
available in the .zip distribution.
Please submit a pull request.
I'm not its maintainer.
On Thursday, August 02, 2012 12:18:37 Walter Bright wrote:
Another big pile of bug fixes. More contributors than ever!
http://www.digitalmars.com/d/1.0/changelog.html
http://ftp.digitalmars.com/dmd.1.075.zip
http://www.digitalmars.com/d/2.0/changelog.html
On 8/2/12 6:46 PM, Jonathan M Davis wrote:
On Thursday, August 02, 2012 12:18:37 Walter Bright wrote:
Another big pile of bug fixes. More contributors than ever!
http://www.digitalmars.com/d/1.0/changelog.html
http://ftp.digitalmars.com/dmd.1.075.zip
On Thursday, August 02, 2012 19:04:06 Andrei Alexandrescu wrote:
On 8/2/12 6:46 PM, Jonathan M Davis wrote:
On Thursday, August 02, 2012 12:18:37 Walter Bright wrote:
Another big pile of bug fixes. More contributors than ever!
http://www.digitalmars.com/d/1.0/changelog.html
On 8/2/12 7:15 PM, Jonathan M Davis wrote:
We need changelog.dd for the new/changes section (which is where the problem
is in this case), but your suggestion makes sense for the bugzilla section. It
requires someone doing the work though. I threw together a quick program to
grab the bugfix list
On 8/3/2012 4:50 AM, Andrej Mitrovic wrote:
On 8/2/12, Walter Bright newshou...@digitalmars.com wrote:
Known issue, it's an inevitable result (it never worked right anyway):
http://d.puremagic.com/issues/show_bug.cgi?id=8454
P.S. You might want to monitor the beta releases.
I've posted
On 8/3/2012 4:40 AM, Peter Alexander wrote:
Nice update, but broke Derelict2 :-(
Regression: delegates with default arguments are broken (worked in 2.059)
void foo(void delegate(int x = 0) fun)
{
fun(); // Error: expected 1 function arguments, not 0
}
I've committed the fix to Derelict2.
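For readers hitting the same regression, one possible workaround is sketched below. It is only an assumption about how calling code could be adapted, and not necessarily what was committed to Derelict2: the default argument is dropped from the delegate type and the value is passed explicitly at the call site.

```d
// Hypothetical workaround, not taken from the Derelict2 commit:
// remove the default argument from the delegate type ...
void foo(void delegate(int x) fun)
{
    // ... and pass the former default value explicitly.
    fun(0);
}
```

The trade-off is that every call site has to spell out the value that used to be implied by `x = 0`.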
On Thu, 02 Aug 2012 19:23:15 -0700,
Walter Bright newshou...@digitalmars.com wrote:
The beauty of git is you don't have to be the maintainer. Anyone can submit
pull requests for any project.
I did use GitHub's fork and pull request workflow before to fix a small bug in Phobos.
What I mean is, the person
On 2012-08-01 22:54, Andrej Mitrovic wrote:
I think many people viewed Unicode this way at first. But there is a
metric ton of cool info out there if you want to get to know more
about unicode (this may or may not be interesting reading material),
e.g.:
On 2012-08-01 22:47, Philippe Sigaud wrote:
I somehow thought that with UTF-8 you were limited to a part of
Unicode, and to another, bigger part with UTF-16.
I equated Unicode with UTF-32.
This is what completely warped my vision. It's good to learn something
new everyday, I guess.
Thanks
On 8/1/2012 10:53 PM, Jonathan M Davis wrote:
What I was expecting there to be was a type which was a range of tokens. You
passed the source string to a function which returned that range, and you
iterated over it to process each token. What you appear to have been arguing
for is another type
On 2012-08-01 22:10, Jonathan M Davis wrote:
It may very well be a good idea to templatize Token on range type. It would be
nice not to have to templatize it, but that may be the best route to go. The
main question is whether str is _always_ a slice (or the result of
takeExactly) of the original
On 8/1/2012 10:44 PM, Jonathan M Davis wrote:
On Wednesday, August 01, 2012 22:33:12 Walter Bright wrote:
The lexer must use char or it will not be acceptable as anything but a toy
for performance reasons.
Avoiding decoding can be done with strings and operating on ranges of dchar,
so you'd
On 2012-08-02 07:44, Jonathan M Davis wrote:
On Wednesday, August 01, 2012 22:33:12 Walter Bright wrote:
The lexer must use char or it will not be acceptable as anything but a toy
for performance reasons.
Avoiding decoding can be done with strings and operating on ranges of dchar,
so you'd be
On 2012-08-02 02:10, Walter Bright wrote:
1. It should accept as input an input range of UTF8. I feel it is a
mistake to templatize it for UTF16 and UTF32. Anyone desiring to feed it
UTF16 should use an 'adapter' range to convert the input to UTF8. (This
is what component programming is all
On Wednesday, August 01, 2012 21:52:15 Jonathan M Davis wrote:
And as much as there are potential performance issues with Phobos' choice of
treating strings as ranges of dchar, if it were to continue to treat them
as ranges of code units, it's pretty much a guarantee that there would be a
On 2012-08-02 08:39, Walter Bright wrote:
That's what I've been saying. So why have an input range of dchars,
which must be decoded in advance, otherwise it wouldn't be a range of
dchars?
Your first requirement is a bit strange and confusing:
1. It should accept as input an input range of
On Thursday, 2 August 2012 at 05:36:37 UTC, Walter Bright wrote:
Using a class implies an extra level of indirection, […]
Use pass-by-ref for the Token.
How is pass-by-ref not an extra level of indirection?
David
Jonathan M Davis wrote in message (digitalmars.D:173942):
It may very well be a good idea to templatize Token on range type. It would
be nice not to have to templatize it, but that may be the best route to go. The
main question is whether str is _always_ a slice (or the result of
http://i.imgur.com/oSXTc.png
Posted without comment.
On 2012-08-02 07:31, Jakob Ovrum wrote:
Which is exactly why I'm pointing out the current, poor approach. Having
a single array with contiguous Tokens for lookahead is completely doable
even when Token is a class with some simple GC.malloc and emplace
composition. I think SDC's Token class is
On Wednesday, August 01, 2012 23:39:42 Walter Bright wrote:
Somebody has to convert the input files into dchars, and then back into
chars. That blows for performance. Think billions and billions of
characters going through, not just a few random strings.
Why is there any converting to
On Thursday, 2 August 2012 at 07:11:36 UTC, Jacob Carlborg wrote:
If you change Token to a struct it takes 64bytes on a LP64
platform. I don't know if that is too big to be passed around
by value.
That's why I moved Token to a class in the first place.
It became far too big and you had to
On 8/1/2012 11:59 PM, Jacob Carlborg wrote:
Your first requirement is a bit strange and confusing:
1. It should accept as input an input range of UTF8.
To my understanding, ranges in Phobos are designed to operate on dchars, not chars,
regardless of the original input type. So if you can create a
On 2012-08-02 09:11, Jacob Carlborg wrote:
If you change Token to a struct it takes 64 bytes on a LP64 platform. I
don't know if that is too big to be passed around by value.
Just for comparison, the type used for tokens in Clang is only 24 bytes.
The main reason is the small source
In my dev work I've shaved some bytes off of Token.
I removed the filename from Location, as we don't assume
the input is a file anymore, and I've changed to tracking
line and column numbers as uint instead of size_t.
I don't know what kind of number I _should_ be aiming for,
but I'd imagine I'm
On 2012-08-02 09:21, Walter Bright wrote:
I answered this point a few posts up in the thread.
I've read a few posts up and the only answer I found is that the lexer
needs to operate on chars. But it does not answer the question how that
range type would be used by all other range based
On 8/1/2012 11:56 PM, Jonathan M Davis wrote:
Another thing that I should point out is that a range of UTF-8 or UTF-16
wouldn't work with many range-based functions at all. Most of std.algorithm
and its ilk would be completely useless. Range-based functions operate on a
range's elements, so
On 8/1/2012 11:49 PM, Jacob Carlborg wrote:
On 2012-08-02 02:10, Walter Bright wrote:
1. It should accept as input an input range of UTF8. I feel it is a
mistake to templatize it for UTF16 and UTF32. Anyone desiring to feed it
UTF16 should use an 'adapter' range to convert the input to UTF8.
On Thursday, 2 August 2012 at 07:29:52 UTC, Walter Bright wrote:
The lexer MUST MUST MUST be FAST FAST FAST. Or it will not be
useful. If it isn't fast, serious users will eschew it and will
cook up their own. You'll have a nice, pretty, useless toy of
std.d.lexer.
If you want to throw out
On Thursday, 2 August 2012 at 07:32:59 UTC, Walter Bright wrote:
On 8/2/2012 12:09 AM, Bernard Helyer wrote:
http://i.imgur.com/oSXTc.png
Posted without comment.
I'm afraid I'm mystified as to your point.
Just that I'm slaving away over a hot IDE here. :P
On 8/2/2012 12:09 AM, Bernard Helyer wrote:
http://i.imgur.com/oSXTc.png
Posted without comment.
I'm afraid I'm mystified as to your point.
On Thursday, August 02, 2012 07:06:25 Christophe Travert wrote:
Jonathan M Davis wrote in message (digitalmars.D:173942):
It may very well be a good idea to templatize Token on range type. It
would be nice not to have to templatize it, but that may be the best
route to go. The
On Thursday, August 02, 2012 00:29:09 Walter Bright wrote:
If we want to be able to operate on ranges of UTF-8 or UTF-16, we need to
add a concept of variable-length encoded ranges so that it's possible to
treat them as both their encoding and whatever they represent (e.g. code
point or
On 2012-08-02 09:26, Bernard Helyer wrote:
In my dev work I've shaved some bytes off of Token.
I removed the filename from Location, as we don't assume
the input is a file anymore, and I've changed to tracking
line and column numbers as uint instead of size_t.
I don't know what kind of number I
On Thursday, 2 August 2012 at 07:42:05 UTC, Jacob Carlborg wrote:
You can probably shave off a couple of bytes by using a
(u)short or (u)byte instead of TokenKind. The TokenKind takes
32 bits, that's way more than what's actually needed.
Good point. I think there's 180 ish at the moment, so
On 2012-08-02 09:29, Walter Bright wrote:
My experience in writing fast string based code that worked on UTF8 and
correctly handled multibyte characters was that they are very possible
and practical, and they are faster.
The lexer MUST MUST MUST be FAST FAST FAST. Or it will not be useful. If
10. High speed matters a lot
then add a benchmark suite to the list - the lexer should be
benchmarked from the very beginning
and it should be designed for multithreading - there is no need for
on-the-fly hash-table updating - maybe just one update at the end of
each lex thread
On 8/2/2012 12:33 AM, Bernard Helyer wrote:
On Thursday, 2 August 2012 at 07:29:52 UTC, Walter Bright wrote:
The lexer MUST MUST MUST be FAST FAST FAST. Or it will not be useful. If it
isn't fast, serious users will eschew it and will cook up their own. You'll
have a nice, pretty, useless toy
On 8/2/2012 12:49 AM, Jacob Carlborg wrote:
But what I still don't understand is how a UTF-8 range is going to be usable by
other range based functions in Phobos.
Worst case use an adapter range.
On 8/2/2012 12:43 AM, Jonathan M Davis wrote:
It is for ranges in general. In the general case, a range of UTF-8 or UTF-16
makes no sense whatsoever. Having range-based functions which understand the
encodings and optimize accordingly can be very beneficial (which happens with
strings but can't
On 8/2/2012 12:29 AM, Jacob Carlborg wrote:
On 2012-08-02 09:21, Walter Bright wrote:
I answered this point a few posts up in the thread.
I've read a few posts up and the only answer I found is that the lexer needs to
operate on chars. But it does not answer the question how that range type
On Thursday, August 02, 2012 01:13:04 Walter Bright wrote:
On 8/2/2012 12:33 AM, Bernard Helyer wrote:
On Thursday, 2 August 2012 at 07:29:52 UTC, Walter Bright wrote:
The lexer MUST MUST MUST be FAST FAST FAST. Or it will not be useful. If
it isn't fast, serious users will eschew it and
On 02.08.2012 10:13, Walter Bright wrote:
On 8/2/2012 12:33 AM, Bernard Helyer wrote:
On Thursday, 2 August 2012 at 07:29:52 UTC, Walter Bright wrote:
The lexer MUST MUST MUST be FAST FAST FAST. Or it will not be useful. If it
isn't fast, serious users will eschew it and will cook up their
On Thursday, August 02, 2012 01:14:30 Walter Bright wrote:
On 8/2/2012 12:43 AM, Jonathan M Davis wrote:
It is for ranges in general. In the general case, a range of UTF-8 or
UTF-16 makes no sense whatsoever. Having range-based functions which
understand the encodings and optimize
On 8/2/2012 12:21 AM, Jonathan M Davis wrote:
Because your input range is a range of dchar?
I think that we're misunderstanding each other here. A typical, well-written,
range-based function which operates on ranges of dchar will use static if or
overloads to special-case strings. This means
On 8/2/2012 1:21 AM, Jonathan M Davis wrote:
How would we measure that? dmd's lexer is tied to dmd, so how would we test
the speed of only its lexer?
Easy. Just make a special version of dmd that lexes only, and time it.
On 8/2/2012 1:38 AM, Jonathan M Davis wrote:
On Thursday, August 02, 2012 01:14:30 Walter Bright wrote:
On 8/2/2012 12:43 AM, Jonathan M Davis wrote:
It is for ranges in general. In the general case, a range of UTF-8 or
UTF-16 makes no sense whatsoever. Having range-based functions which
On Thursday, 2 August 2012 at 05:36:37 UTC, Walter Bright wrote:
Using a class implies an extra level of indirection, and the
other issue is the only point to using a class is if you're
going to derive from it and override its methods. I don't see
that for a Token.
Use pass-by-ref for the
On 8/2/2012 12:34 AM, Bernard Helyer wrote:
Just that I'm slaving away over a hot IDE here. :P
Ah! Well, keep on keepin' on, then!
Walter Bright wrote in message (digitalmars.D:174015):
On 8/2/2012 12:49 AM, Jacob Carlborg wrote:
But what I still don't understand is how a UTF-8 range is going to be usable
by other range based functions in Phobos.
Worst case use an adapter range.
Yes
auto r =
Walter Bright wrote:
1. It should accept as input an input range of UTF8. I feel it is a
mistake to templatize it for UTF16 and UTF32. Anyone desiring to feed it
UTF16 should use an 'adapter' range to convert the input to UTF8. (This
is what component programming is all about.)
Why it is a
On 8/2/2012 2:27 AM, Piotr Szturmaj wrote:
Walter Bright wrote:
1. It should accept as input an input range of UTF8. I feel it is a
mistake to templatize it for UTF16 and UTF32. Anyone desiring to feed it
UTF16 should use an 'adapter' range to convert the input to UTF8. (This
is what component
On 02/08/2012 07:35, Walter Bright wrote:
Using a class implies an extra level of indirection, and the other issue
is the only point to using a class is if you're going to derive from it
and override its methods. I don't see that for a Token.
Use pass-by-ref for the Token.
The fact that
On 02/08/2012 06:48, Walter Bright wrote:
On 8/1/2012 9:41 PM, H. S. Teoh wrote:
Whether it's part of the range type or a separate lexer type,
*definitely* make it possible to have multiple instances. One of the
biggest flaws of otherwise-good lexer generators like lex and flex
(C/C++) is
7. It should accept a callback delegate for errors. That delegate should
decide whether to:
1. ignore the error (and Lexer will try to recover and continue)
2. print an error message (and Lexer will try to recover and continue)
3. throw an exception, Lexer is done with that input range
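The three options above could be modelled roughly as follows. This is only a sketch under stated assumptions: `ErrorAction`, `LexError`, `ErrorHandler` and `reportError` are hypothetical names made up for illustration and do not come from any proposed std.d.lexer API.

```d
import std.stdio : stderr;

// Hypothetical names throughout; nothing here is an actual Phobos API.
enum ErrorAction { ignore, print, abort }

class LexError : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// The user-supplied callback decides what happens for each error.
alias ErrorHandler = ErrorAction delegate(string message, size_t line, size_t column);

/// Called by the (hypothetical) lexer when it meets malformed input.
/// Returns true if the lexer should try to recover and continue.
bool reportError(ErrorHandler handler, string message, size_t line, size_t column)
{
    final switch (handler(message, line, column))
    {
        case ErrorAction.ignore:
            return true;                  // option 1: recover silently
        case ErrorAction.print:
            stderr.writefln("%s:%s: %s", line, column, message);
            return true;                  // option 2: report, then recover
        case ErrorAction.abort:
            throw new LexError(message);  // option 3: lexer is done with this input
    }
}
```

Returning an enum from the delegate keeps the policy in user code while the lexer keeps control of recovery, which is one way to read the requirement; other designs (for example, letting the delegate throw directly) would work too.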
On 08/02/12 05:41, Jonathan M Davis wrote:
On Thursday, August 02, 2012 05:34:37 cal wrote:
Just wanted to point out that for a while now when you type dlang
into google, the summary that google puts up starts with the
latin from the input to the sample code on the main page:
Standard input.
On 02/08/2012 10:13, Walter Bright wrote:
On 8/2/2012 12:33 AM, Bernard Helyer wrote:
On Thursday, 2 August 2012 at 07:29:52 UTC, Walter Bright wrote:
The lexer MUST MUST MUST be FAST FAST FAST. Or it will not be useful.
If it isn't fast, serious users will eschew it and will cook up their
On 02/08/2012 10:44, Walter Bright wrote:
On 8/2/2012 1:38 AM, Jonathan M Davis wrote:
On Thursday, August 02, 2012 01:14:30 Walter Bright wrote:
On 8/2/2012 12:43 AM, Jonathan M Davis wrote:
It is for ranges in general. In the general case, a range of UTF-8 or
UTF-16 makes no sense
On 02/08/2012 09:30, Walter Bright wrote:
On 8/1/2012 11:49 PM, Jacob Carlborg wrote:
On 2012-08-02 02:10, Walter Bright wrote:
1. It should accept as input an input range of UTF8. I feel it is a
mistake to templatize it for UTF16 and UTF32. Anyone desiring to feed it
UTF16 should use an
On Thursday, 2 August 2012 at 11:47:20 UTC, deadalnix wrote:
lexer really isn't the performance bottleneck of dmd (or any
compiler of a non trivial language).
What if we're just using this lexer in something like a
syntax highlighting text editor? I'd be annoyed if it
stopped typing for a
On 8/2/12 6:07 AM, Walter Bright wrote:
Why? I've never seen any UTF16 or UTF32 D source in the wild.
Here's a crazy idea that I'll hang to this one remark. No, two crazy ideas.
First, after having read the large back-and-forth Jonathan/Walter in one
sitting, it's becoming obvious to me
On 02-Aug-12 12:44, Walter Bright wrote:
On 8/2/2012 1:38 AM, Jonathan M Davis wrote:
On Thursday, August 02, 2012 01:14:30 Walter Bright wrote:
On 8/2/2012 12:43 AM, Jonathan M Davis wrote:
It is for ranges in general. In the general case, a range of UTF-8 or
UTF-16 makes no sense
Intrigued by a familiar topic in std.lexer. I've split it out.
It's not as easy a question as it seems.
Before you start the usual "because a code point has semantic meaning, a
code unit is just bytes" yada yada, let me explain something to you.
Codepoint is indeed a complete piece of symbolic
On 8/2/2012 4:49 AM, deadalnix wrote:
How is that different from a manually done range of dchar?
The decoding is rarely necessary, even if non-ascii data is there. However, the
range cannot decide if decoding is necessary - the receiver has to, hence the
receiver does the decoding.
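A minimal sketch of what "the receiver does the decoding" might look like: the consumer walks UTF-8 code units directly and calls `std.utf.decode` only when it sees a byte with the high bit set. `identifierLength` is a hypothetical helper written for this illustration, not code from dmd or Phobos, and its identifier rule is simplified (ASCII letters, digits and `_`, plus any alphabetic non-ASCII code point).

```d
import std.utf : decode;
import std.uni : isAlpha;

/// Returns the length, in code units, of the identifier at the start of
/// `src` (0 if there is none). Illustrative helper only.
size_t identifierLength(const(char)[] src)
{
    size_t i = 0;
    while (i < src.length)
    {
        immutable char c = src[i];
        if (c < 0x80)
        {
            // ASCII fast path: a plain comparison, no decoding.
            immutable bool ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || c == '_' || (i > 0 && c >= '0' && c <= '9');
            if (!ok)
                break;
            ++i;
        }
        else
        {
            // High bit set: only now does the consumer decode a code point.
            size_t next = i;
            immutable dchar d = decode(src, next); // throws UTFException on invalid UTF-8
            if (!isAlpha(d))
                break;
            i = next;
        }
    }
    return i;
}
```

For pure-ASCII source the decode branch is never taken, which is the performance argument being made in the thread.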
On 8/2/2012 8:46 AM, Dmitry Olshansky wrote:
Keep a 6 character buffer in your consumer. If you read a char with the
high bit set, start filling that buffer and then decode it.
4 bytes is enough.
Since Unicode 5(?) the range of codepoints was defined to be 0...0x10
specifically so that it
On 8/2/2012 4:27 AM, deadalnix wrote:
Really nice idea. It is still easy to wrap the Range in another Range that
processes errors in a custom way.
It's suboptimal for a high speed lexer/parser, as already explained.
Speed is of paramount importance for a lexer, and it overrides things like
On 8/2/2012 4:47 AM, deadalnix wrote:
On 02/08/2012 10:13, Walter Bright wrote:
As fast as the dmd one would be best.
That'd be great but . . .
lexer really isn't the performance bottleneck of dmd (or any compiler of a non
trivial language). Additionally, anybody that has touched dmd
On 8/2/2012 4:52 AM, deadalnix wrote:
On 02/08/2012 09:30, Walter Bright wrote:
On 8/1/2012 11:49 PM, Jacob Carlborg wrote:
On 2012-08-02 02:10, Walter Bright wrote:
1. It should accept as input an input range of UTF8. I feel it is a
mistake to templatize it for UTF16 and UTF32. Anyone
On Thu, 02 Aug 2012 14:26:58 +0200,
Adam D. Ruppe destructiona...@gmail.com wrote:
On Thursday, 2 August 2012 at 11:47:20 UTC, deadalnix wrote:
lexer really isn't the performance bottleneck of dmd (or any
compiler of a non trivial language).
What if we're just using this lexer in