Re: std.d.lexer requirements

2012-08-07 Thread Jonathan M Davis
On Tuesday, August 07, 2012 14:30:51 Walter Bright wrote: > On 8/7/2012 2:14 PM, Jonathan M Davis wrote: > > I expect that the configuration stuff is going to have to be adjusted > > after I'm done, since I'm not sure that it's entirely clear what's worth > > configuring or not. > > "When in doubt

Re: std.d.lexer requirements

2012-08-07 Thread Christophe Travert
Jacob Carlborg wrote in message (digitalmars.D:174421): > On 2012-08-07 12:06, Jonathan M Davis wrote: > >> It's easier to see where in the range of tokens the errors occur. A delegate >> is disconnected from the point where the range is being consumed, whereas if >> tokens are used for

Re: std.d.lexer requirements

2012-08-07 Thread Walter Bright
On 8/7/2012 2:14 PM, Jonathan M Davis wrote: I expect that the configuration stuff is going to have to be adjusted after I'm done, since I'm not sure that it's entirely clear what's worth configuring or not. "When in doubt, leave it out." If experience later shows it is really needed, it is ea

Re: std.d.lexer requirements

2012-08-07 Thread Jonathan M Davis
On Tuesday, August 07, 2012 12:38:26 Walter Bright wrote: > Yes, I understand that. There's also a point about adding too much > complexity to the interface. The delegate callback reduces complexity in > the interface. It doesn't really affect much to allow choosing between returning a token and

Re: std.d.lexer requirements

2012-08-07 Thread Philippe Sigaud
On Tue, Aug 7, 2012 at 9:38 PM, Walter Bright wrote: > Yes, I understand about static if decisions :-) hell I invented them! And what a wonderful decision that was! > Yes, I understand that. There's also a point about adding too much > complexity to the interface. The delegate callback reduces

Re: std.d.lexer requirements

2012-08-07 Thread Walter Bright
On 8/7/2012 7:15 AM, Philippe Sigaud wrote: Also, what I proposed was a *static* decision: with SkipErrors { no, yes }. With a static if inside its guts, the lexer could change its behavior accordingly. Yes, I understand about static if decisions :-) hell I invented them! Walter, with all du

Re: std.d.lexer requirements

2012-08-07 Thread Walter Bright
On 8/7/2012 3:06 AM, Jonathan M Davis wrote: It's easier to see where in the range of tokens the errors occur. A delegate is disconnected from the point where the range is being consumed, whereas if tokens are used for errors, then the function consuming the range can see exactly where in the ran

Re: std.d.lexer requirements

2012-08-07 Thread Jacob Carlborg
On 2012-08-07 12:06, Jonathan M Davis wrote: It's easier to see where in the range of tokens the errors occur. A delegate is disconnected from the point where the range is being consumed, whereas if tokens are used for errors, then the function consuming the range can see exactly where in the ra

Re: std.d.lexer requirements

2012-08-07 Thread Philippe Sigaud
On Tue, Aug 7, 2012 at 12:06 PM, Jonathan M Davis wrote: > Regardless, I was asked to keep that option in there by at least one person > (Philippe Sigaud IIRC), which is why I didn't just switch over to the delegate > entirely. IIRC, I was not the only one, as people here interested in coding an

Re: std.d.lexer requirements

2012-08-07 Thread Christophe Travert
Walter Bright wrote in message (digitalmars.D:174394): > On 8/7/2012 1:14 AM, Jonathan M Davis wrote: >> But you can also configure the lexer to return an error token instead of >> using >> the delegate if that's what you prefer. But Walter is right in that if you >> have to check every

Re: std.d.lexer requirements

2012-08-07 Thread Christophe Travert
Walter Bright wrote in message (digitalmars.D:174393): > If the delegate returns, then the lexer recovers. That's an option, if there is only one way to recover (which is a reasonable assumption). You wanted the delegate to "decide what to do with the errors (ignore, throw exception,

Re: std.d.lexer requirements

2012-08-07 Thread Jonathan M Davis
On Tuesday, August 07, 2012 02:54:42 Walter Bright wrote: > On 8/7/2012 1:14 AM, Jonathan M Davis wrote: > > But you can also configure the lexer to return an error token instead of > > using the delegate if that's what you prefer. But Walter is right in that > > if you have to check every token fo

Re: std.d.lexer requirements

2012-08-07 Thread Walter Bright
On 8/7/2012 1:14 AM, Jonathan M Davis wrote: But you can also configure the lexer to return an error token instead of using the delegate if that's what you prefer. But Walter is right in that if you have to check every token for whether it's an error, that will incur overhead. So, depending on yo

Re: std.d.lexer requirements

2012-08-07 Thread Walter Bright
On 8/7/2012 1:00 AM, Christophe Travert wrote: That's why I suggested supplying a callback delegate to decide what to do with errors (ignore, throw exception, or quit) and have the delegate itself do that. That way, there is no customization of the Lexer required. It may be easier to take into
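
The callback approach described in this message can be sketched as follows. This is only an illustration under stated assumptions: the thread is about a D lexer, whereas the sketch is C++, and every name in it (`LexError`, `ErrorHandler`, `lexWords`) is hypothetical rather than part of std.d.lexer or dmd. The handler alone decides the policy: if it returns, the toy lexer recovers by skipping the offending character; if it throws, lexing stops.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical error record; not taken from any real lexer API.
struct LexError {
    std::size_t offset;   // index of the offending character in the input
    std::string message;
};

// The handler decides what happens: return to let the lexer recover and
// continue, or throw to abandon lexing altogether.
using ErrorHandler = std::function<void(const LexError&)>;

// Toy lexer: tokens are runs of letters, whitespace is skipped, and any
// other character is reported through the handler.
std::vector<std::string> lexWords(const std::string& src,
                                  const ErrorHandler& onError) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isalpha(c)) {
            std::size_t start = i;
            while (i < src.size() &&
                   std::isalpha(static_cast<unsigned char>(src[i]))) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
        } else if (std::isspace(c)) {
            ++i;
        } else {
            onError(LexError{i, "unexpected character"});
            ++i;  // recovery: the handler returned, so skip the character
        }
    }
    return tokens;
}
```

With this shape, "ignore", "collect", and "throw" are all just different handlers passed by the caller, so the lexer itself needs no error-policy configuration.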

Re: std.d.lexer requirements

2012-08-07 Thread Walter Bright
On 8/6/2012 5:14 PM, Jason House wrote: The following is an incredibly fast multithreaded hash table. It is both lock-free and fence-free. Would something like that solve your problem? http://www.azulsystems.com/events/javaone_2007/2007_LockFreeHash.pdf It might if I understood it! There do s

Re: std.d.lexer requirements

2012-08-07 Thread Jonathan M Davis
On Tuesday, August 07, 2012 08:00:24 Christophe Travert wrote: > Walter Bright wrote in message (digitalmars.D:174360): > > That's why I suggested supplying a callback delegate to decide what to do > > with errors (ignore, throw exception, or quit) and have the delegate > > itself do th

Re: std.d.lexer requirements

2012-08-07 Thread Christophe Travert
Walter Bright wrote in message (digitalmars.D:174360): > On 8/6/2012 12:00 PM, Philippe Sigaud wrote: >> Yes, well we don't have a condition system. And using exceptions >> during lexing would most probably kill its efficiency. >> Errors in lexing are not uncommon. The usual D idiom of h

Re: std.d.lexer requirements

2012-08-06 Thread Jason House
On Thursday, 2 August 2012 at 04:48:56 UTC, Walter Bright wrote: On 8/1/2012 9:41 PM, H. S. Teoh wrote: Whether it's part of the range type or a separate lexer type, *definitely* make it possible to have multiple instances. One of the biggest flaws of otherwise-good lexer generators like lex an

Re: std.d.lexer requirements

2012-08-06 Thread Dmitry Olshansky
On 07-Aug-12 01:48, Jacob Carlborg wrote: On 2012-08-06 22:26, Dmitry Olshansky wrote: No. And doing Tokens as special comment token is frankly bad idea. See Walter's comments in this thread. Also e.g. For compiler only DDoc ones are ever useful, not so for IDE. Filtering them out later is in

Re: std.d.lexer requirements

2012-08-06 Thread Jacob Carlborg
On 2012-08-06 22:26, Dmitry Olshansky wrote: No. And doing Tokens as special comment token is frankly bad idea. See Walter's comments in this thread. Also e.g. For compiler only DDoc ones are ever useful, not so for IDE. Filtering them out later is inefficient, as it would be far better not to

Re: std.d.lexer requirements

2012-08-06 Thread Walter Bright
On 8/6/2012 12:00 PM, Philippe Sigaud wrote: Yes, well we don't have a condition system. And using exceptions during lexing would most probably kill its efficiency. Errors in lexing are not uncommon. The usual D idiom of having an enum StopOnError { no, yes } should be enough. That's why I sug

Re: std.d.lexer requirements

2012-08-06 Thread Dmitry Olshansky
On 06-Aug-12 22:03, deadalnix wrote: On 04/08/2012 15:45, Dmitry Olshansky wrote: On 04-Aug-12 15:48, Jonathan M Davis wrote: On Saturday, August 04, 2012 15:32:22 Dmitry Olshansky wrote: I see it as a compile-time policy, that will fit nicely and solve both issues. Just provide a template

Re: std.d.lexer requirements

2012-08-06 Thread Jacob Carlborg
On 2012-08-06 21:00, Philippe Sigaud wrote: Yes, well we don't have a condition system. And using exceptions during lexing would most probably kill its efficiency. Errors in lexing are not uncommon. The usual D idiom of having an enum StopOnError { no, yes } should be enough. Especially when i

Re: std.d.lexer requirements

2012-08-06 Thread Philippe Sigaud
On Mon, Aug 6, 2012 at 8:03 PM, deadalnix wrote: > The most complex thing that is needed is the policy to allocate identifiers > in tokens. It can be made by passing a function that have a string as > parameter and a string as return value. The default one would be an identity > function. I thin

Re: std.d.lexer requirements

2012-08-06 Thread deadalnix
On 04/08/2012 15:45, Dmitry Olshansky wrote: On 04-Aug-12 15:48, Jonathan M Davis wrote: On Saturday, August 04, 2012 15:32:22 Dmitry Olshansky wrote: I see it as a compile-time policy, that will fit nicely and solve both issues. Just provide a template with a few hooks, and add a Noop poli

Re: std.d.lexer requirements

2012-08-06 Thread Ary Manzana
On 8/1/12 21:10 , Walter Bright wrote: 8. Lexer should be configurable as to whether it should collect information about comments and ddoc comments or not 9. Comments and ddoc comments should be attached to the next following token, they should not themselves be tokens I believe there should b

Re: std.d.lexer requirements

2012-08-05 Thread Walter Bright
On 8/5/2012 12:59 AM, Brad Roberts wrote: To help with performance comparisons I ripped dmd's lexer out and got it building as a few .d files. It's very crude. It's got tons of casts (more than the original c++ version). I attempted no cleanup or any other change than the minimum I could to g

Re: std.d.lexer requirements

2012-08-05 Thread Jonathan M Davis
On Saturday, August 04, 2012 17:45:58 Dmitry Olshansky wrote: > On 04-Aug-12 15:48, Jonathan M Davis wrote: > > On Saturday, August 04, 2012 15:32:22 Dmitry Olshansky wrote: > >> I see it as a compile-time policy, that will fit nicely and solve both > >> issues. Just provide a template with a few

Re: std.d.lexer requirements

2012-08-05 Thread Brad Roberts
To help with performance comparisons I ripped dmd's lexer out and got it building as a few .d files. It's very crude. It's got tons of casts (more than the original c++ version). I attempted no cleanup or any other change than the minimum I could to get it to build and run. Obviously there's t

Re: std.d.lexer requirements

2012-08-04 Thread Chad J
On 08/02/2012 03:09 AM, Bernard Helyer wrote: http://i.imgur.com/oSXTc.png Posted without comment. Hell yeah Alexander Brandon.

Re: std.d.lexer requirements

2012-08-04 Thread Christophe Travert
Jonathan M Davis wrote in message (digitalmars.D:174223): > On Saturday, August 04, 2012 15:32:22 Dmitry Olshansky wrote: >> I see it as a compile-time policy, that will fit nicely and solve both >> issues. Just provide a template with a few hooks, and add a Noop policy >> that does not

Re: std.d.lexer requirements

2012-08-04 Thread Dmitry Olshansky
On 04-Aug-12 15:48, Jonathan M Davis wrote: On Saturday, August 04, 2012 15:32:22 Dmitry Olshansky wrote: I see it as a compile-time policy, that will fit nicely and solve both issues. Just provide a template with a few hooks, and add a Noop policy that does nothing. It's starting to look lik

Re: std.d.lexer requirements

2012-08-04 Thread Tobias Pankrath
On Saturday, 4 August 2012 at 11:58:09 UTC, Jonathan M Davis wrote: On Saturday, August 04, 2012 15:32:22 Dmitry Olshansky wrote: I see it as a compile-time policy, that will fit nicely and solve both issues. Just provide a template with a few hooks, and add a Noop policy that does nothing.

Re: std.d.lexer requirements

2012-08-04 Thread Jonathan M Davis
On Saturday, August 04, 2012 15:32:22 Dmitry Olshansky wrote: > I see it as a compile-time policy, that will fit nicely and solve both > issues. Just provide a template with a few hooks, and add a Noop policy > that does nothing. It's starting to look like figuring out what should and shouldn't b

Re: std.d.lexer requirements

2012-08-04 Thread Dmitry Olshansky
On 04-Aug-12 14:55, Christophe Travert wrote: Dmitry Olshansky wrote in message (digitalmars.D:174214): Most likely - since you re-read the same memory twice to do it. You're probably right, but if you do this right after the token is generated, the memory should still be near the p

Re: std.d.lexer requirements

2012-08-04 Thread Christophe Travert
Dmitry Olshansky wrote in message (digitalmars.D:174214): > Most likely - since you re-read the same memory twice to do it. You're probably right, but if you do this right after the token is generated, the memory should still be near the processor. And the operation on the first read

Re: std.d.lexer requirements

2012-08-04 Thread Christophe Travert
Jonathan M Davis wrote in message (digitalmars.D:174191): > On Thursday, August 02, 2012 11:08:23 Walter Bright wrote: >> The tokens are not kept, correct. But the identifier strings, and the string >> literals, are kept, and if they are slices into the input buffer, then >> everything I

Re: std.d.lexer requirements

2012-08-04 Thread Dmitry Olshansky
On 04-Aug-12 14:02, Christophe Travert wrote: Jonathan M Davis wrote in message (digitalmars.D:174191): On Thursday, August 02, 2012 11:08:23 Walter Bright wrote: The tokens are not kept, correct. But the identifier strings, and the string literals, are kept, and if they are slices i

Re: std.d.lexer requirements

2012-08-04 Thread deadalnix
On 03/08/2012 21:59, Walter Bright wrote: On 8/3/2012 6:18 AM, deadalnix wrote: lexer can have a parameter that tell if it should build a table of token or slice the input. The second is important, for instance for an IDE : lexing will occur often, and you prefer slicing here because you alre

Re: std.d.lexer requirements

2012-08-03 Thread Jonathan M Davis
On Thursday, August 02, 2012 11:08:23 Walter Bright wrote: > The tokens are not kept, correct. But the identifier strings, and the string > literals, are kept, and if they are slices into the input buffer, then > everything I said applies. String literals often _can't_ be slices unless you leave t

Re: std.d.lexer requirements

2012-08-03 Thread Walter Bright
On 8/3/2012 6:18 AM, deadalnix wrote: lexer can have a parameter that tell if it should build a table of token or slice the input. The second is important, for instance for an IDE : lexing will occur often, and you prefer slicing here because you already have the source file in memory anyway. A

Re: std.d.lexer requirements

2012-08-03 Thread Walter Bright
On 8/3/2012 4:40 AM, Tobias Pankrath wrote: Would this be an argument for putting the computation of source locations (i.e. line + offset or similar) into the range / into an template argument / policy, so that it's done in the most effective way for the client? Kate for example has a "range"-ty

Re: std.d.lexer requirements

2012-08-03 Thread Jacob Carlborg
On 2012-08-03 18:49, Dmitry Olshansky wrote: Draw thing to an off screen bitmap then blit it to window (aye, pass back to UI thread a reference to the buffer with pixels). This technique been in use for decades. Imagine drawing some large intricate fractal it could easily take few seconds. O

Re: std.d.lexer requirements

2012-08-03 Thread Dmitry Olshansky
On 03-Aug-12 10:35, Jacob Carlborg wrote: On 2012-08-03 00:25, Dmitry Olshansky wrote: OT: It never ceases to amaze me how people miss this very simple point: GUI runs on its own thread and shouldn't ever block on something (save for message pump itself, of course). Everything else (including p

Re: std.d.lexer requirements

2012-08-03 Thread Tobias Pankrath
On Friday, 3 August 2012 at 14:49:55 UTC, trav...@phare.normalesup.org If I may add, there are several possibilities here: 1- a real slice of the input range 2- a slice of the input range created with .save and takeExactly 3- a slice allocated in GC memory by the lexer 4- a slice of memory

Re: std.d.lexer requirements

2012-08-03 Thread Christophe Travert
deadalnix wrote in message (digitalmars.D:174155): >> The tokens are not kept, correct. But the identifier strings, and the >> string literals, are kept, and if they are slices into the input buffer, >> then everything I said applies. >> > > Ok, what do you think of that : > > lexer ca

Re: std.d.lexer requirements

2012-08-03 Thread deadalnix
On 02/08/2012 20:14, Marco Leise wrote: On Thu, 02 Aug 2012 14:26:58 +0200, "Adam D. Ruppe" wrote: On Thursday, 2 August 2012 at 11:47:20 UTC, deadalnix wrote: lexer really isn't the performance bottleneck of dmd (or any compiler of a non trivial language). What if we're just using this

Re: std.d.lexer requirements

2012-08-03 Thread deadalnix
On 03/08/2012 05:41, Andrei Alexandrescu wrote: On 8/2/12 11:08 PM, Jonathan M Davis wrote: You're not going to get as fast a lexer if it's not written specifically for D. Writing a generic lexer is a different problem. It's also one that needs to be solved, but I think that it's a mistake to

Re: std.d.lexer requirements

2012-08-03 Thread deadalnix
On 02/08/2012 20:08, Walter Bright wrote: On 8/2/2012 4:52 AM, deadalnix wrote: On 02/08/2012 09:30, Walter Bright wrote: On 8/1/2012 11:49 PM, Jacob Carlborg wrote: On 2012-08-02 02:10, Walter Bright wrote: 1. It should accept as input an input range of UTF8. I feel it is a mistake to

Re: std.d.lexer requirements

2012-08-03 Thread Tobias Pankrath
Correct, that's the whole point of using a range - it can come from anything. For example, let's suppose we want to do D syntax highlighting in our IDE. It is highly unlikely that the text editor's data structure is a simple string. It's likely to be an array of lines, or something like that.

Re: std.d.lexer requirements

2012-08-03 Thread Walter Bright
On 8/3/2012 3:07 AM, Walter Bright wrote: On 8/3/2012 1:21 AM, Christophe Travert wrote: This range does not have to be a string, it can be a something over a file, stream, socket. It can also be the result of an algorithm, because you *can* use algorithm on ranges of char, and it makes sense if

Re: std.d.lexer requirements

2012-08-03 Thread Walter Bright
On 8/3/2012 1:21 AM, Christophe Travert wrote: This range does not have to be a string, it can be a something over a file, stream, socket. It can also be the result of an algorithm, because you *can* use algorithm on ranges of char, and it makes sense if you know what you are doing. Correct, th

Re: std.d.lexer requirements

2012-08-03 Thread Walter Bright
On 8/3/2012 2:01 AM, Jacob Carlborg wrote: On 2012-08-03 09:35, Walter Bright wrote: Look in doc.c at highlightCode2() for how to call the lexer by itself. So basically: Token tok; //start timer while (tok.value != TOKeof) lex.scan(&tok); //end timer Something like that? Pretty m

Re: std.d.lexer requirements

2012-08-03 Thread Jacob Carlborg
On 2012-08-03 09:35, Walter Bright wrote: Look in doc.c at highlightCode2() for how to call the lexer by itself. So basically: Token tok; //start timer while (tok.value != TOKeof) lex.scan(&tok); //end timer Something like that? -- /Jacob Carlborg
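
A minimal sketch of the timing loop proposed above, assuming a stand-in lexer rather than dmd's own: `ToyLexer`, `Tok`, `LexTiming`, and `timeLexing` are hypothetical names invented for this illustration, and only the shape of the loop (scan until the end-of-file token, with a timer around it) comes from the message.

```cpp
#include <cctype>
#include <chrono>
#include <cstddef>
#include <string>

// Stand-in token kinds; only the idea of an end-of-file token is taken
// from the thread. dmd's real lexer has many more kinds.
enum class Tok { word, other, eof };

// Toy lexer, NOT dmd's Lexer class: alphanumeric runs are one token,
// any other non-space character is a token of its own.
class ToyLexer {
public:
    explicit ToyLexer(const std::string& src) : src_(src) {}

    Tok scan() {
        while (pos_ < src_.size() && isSpace(src_[pos_])) ++pos_;
        if (pos_ >= src_.size()) return Tok::eof;
        if (isAlnum(src_[pos_])) {
            while (pos_ < src_.size() && isAlnum(src_[pos_])) ++pos_;
            return Tok::word;
        }
        ++pos_;
        return Tok::other;
    }

private:
    static bool isSpace(char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    }
    static bool isAlnum(char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0;
    }

    const std::string& src_;  // the caller keeps the source alive
    std::size_t pos_ = 0;
};

struct LexTiming {
    std::size_t tokens;  // tokens seen across all repetitions
    double seconds;      // wall-clock time spent inside the loop
};

// Lexes the same source `repetitions` times with a timer around the loop.
LexTiming timeLexing(const std::string& src, int repetitions) {
    using clock = std::chrono::steady_clock;
    std::size_t count = 0;
    const auto start = clock::now();
    for (int r = 0; r < repetitions; ++r) {
        ToyLexer lex(src);
        while (lex.scan() != Tok::eof) ++count;
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return LexTiming{count, elapsed.count()};
}
```

Counting tokens inside the loop gives the benchmark an observable result, which helps keep an optimizing compiler from discarding the work being timed.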

Re: std.d.lexer requirements

2012-08-03 Thread Walter Bright
On 8/3/2012 1:27 AM, dennis luehring wrote: Walter mentioned an easier way - without "forking" dmd - should be better an integral part of the ongoing development maybe under a tools or benchmark section? Forking it is fine. This is just for a one-off benchmarking thing.

Re: std.d.lexer requirements

2012-08-03 Thread dennis luehring
On 03.08.2012 09:56, Ed McCardell wrote: On 08/02/2012 04:41 AM, Walter Bright wrote: On 8/2/2012 1:21 AM, Jonathan M Davis wrote: How would we measure that? dmd's lexer is tied to dmd, so how would we test the speed of only its lexer? Easy. Just make a special version of dmd that lexes onl

Re: std.d.lexer requirements

2012-08-03 Thread Christophe Travert
Jacob Carlborg wrote in message (digitalmars.D:174131): > static if(isNarrowString!R) > Unqual!(ElementEncodingType!R) first = range[0]; > else > dchar first = range.front; I find it more comfortable to just use first = range.front, with a range of char or ubyte. This range d

Re: std.d.lexer requirements

2012-08-03 Thread Ed McCardell
On 08/02/2012 04:41 AM, Walter Bright wrote: On 8/2/2012 1:21 AM, Jonathan M Davis wrote: How would we measure that? dmd's lexer is tied to dmd, so how would we test the speed of only its lexer? Easy. Just make a special version of dmd that lexes only, and time it. I made a lexing-only versi

Re: std.d.lexer requirements

2012-08-03 Thread Walter Bright
On 8/3/2012 12:11 AM, Jacob Carlborg wrote: On 2012-08-03 08:59, Walter Bright wrote: You don't need to extract it to measure it. Just have it lex the source files in a loop, and time that loop. Well, that's the problem. It's not like DMD has a single "lex" function that does all the job. Wo

Re: std.d.lexer requirements

2012-08-03 Thread Jacob Carlborg
On 2012-08-03 08:59, Walter Bright wrote: You don't need to extract it to measure it. Just have it lex the source files in a loop, and time that loop. Well, that's the problem. It's not like DMD has a single "lex" function that does all the job. Would it perhaps be possible to time Parser::

Re: std.d.lexer requirements

2012-08-03 Thread Walter Bright
On 8/2/2012 11:37 PM, Jacob Carlborg wrote: I'm not sure how easy it would be to just measure the lexing phase of DMD. If it's easy someone would probably already have extracted the lexer from DMD. You don't need to extract it to measure it. Just have it lex the source files in a loop, and ti

Re: std.d.lexer requirements

2012-08-03 Thread Jacob Carlborg
On 2012-08-03 08:49, dennis luehring wrote: wouldn't it be better to extract the lexer part of dmd into its own (hopefully small) library - that way the lexer is still useable by dmd AND benchmarkable from outside - it is then even possible to replace the dmd lexer by an D version due to the c l

Re: std.d.lexer requirements

2012-08-03 Thread Jacob Carlborg
On 2012-08-02 22:51, Christophe Travert wrote: Jacob Carlborg wrote in message (digitalmars.D:174069): On 2012-08-02 10:15, Walter Bright wrote: Worst case use an adapter range. And that is better than a plain string? because its front method does not do any decoding. If it wa

Re: std.d.lexer requirements

2012-08-02 Thread dennis luehring
On 03.08.2012 08:37, Jacob Carlborg wrote: On 2012-08-03 00:01, Walter Bright wrote: But we do have the DMD lexer which is useful as a benchmark and a guide. I won't say it couldn't be made faster, but it does set a minimum bar for performance. I'm not sure how easy it would be to just meas

Re: std.d.lexer requirements

2012-08-02 Thread Jakob Ovrum
On Friday, 3 August 2012 at 04:02:34 UTC, Bernard Helyer wrote: I'll let you get on with it then. I'll amuse myself with the thought of someone asking why SDC doesn't use std.d.lexer or a parser generator. I'll then hit them with my cane, and tell them to get off of my lawn. :P I don't think yo

Re: std.d.lexer requirements

2012-08-02 Thread Jacob Carlborg
On 2012-08-03 00:01, Walter Bright wrote: But we do have the DMD lexer which is useful as a benchmark and a guide. I won't say it couldn't be made faster, but it does set a minimum bar for performance. I'm not sure how easy it would be to just measure the lexing phase of DMD. If it's easy som

Re: std.d.lexer requirements

2012-08-02 Thread Jacob Carlborg
On 2012-08-03 00:25, Dmitry Olshansky wrote: OT: It never ceases to amaze me how people miss this very simple point: GUI runs on its own thread and shouldn't ever block on something (save for message pump itself, of course). Everything else (including possibly slow rendering) done on the side an

Re: std.d.lexer requirements

2012-08-02 Thread Jacob Carlborg
On 2012-08-03 00:10, Walter Bright wrote: The rendering code should be in yet a third thread. Most GUI systems are not thread safe. I know for sure that Cocoa on Mac OS X is not. All the changes to the GUI needs to happen in the same thread. But you can usually post a message from another th

Re: std.d.lexer requirements

2012-08-02 Thread Jacob Carlborg
On 2012-08-02 22:54, Andrej Mitrovic wrote: It can do that immediately for the text that's visible in the window because ~100 lines of text can be lexed pretty damn instantly. As soon as that's done the GUI should be responsive and the rest of the text buffer should be lexed in the background.

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Friday, August 03, 2012 07:30:32 Philippe Sigaud wrote: > Look, many of us here were interested in your idea of having comments > and errors lexed as tokens. > Could it be possible to just add two static policies errorBehavior { > report, skip, ...} and comments { asToken, grouped }? > That way,

Re: std.d.lexer requirements

2012-08-02 Thread Philippe Sigaud
On Fri, Aug 3, 2012 at 5:59 AM, Jonathan M Davis wrote: > On Friday, August 03, 2012 05:36:05 Bernard Helyer wrote: >> If the other guys think they've got it, then I can withdraw my >> efforts. I was just thinking I had a lexer just sitting around, >> may as well use it, but if the other guys have

Re: std.d.lexer requirements

2012-08-02 Thread Philippe Sigaud
On Fri, Aug 3, 2012 at 6:14 AM, Timon Gehr wrote: >> If someone wants to try and write a generic lexer for D and see if they >> can >> beat out any hand-written ones, > > > I'll possibly give it a shot if I can find the time. I propose we let him finish std.lexer, test it with Jonathan, benchmar

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Friday, August 03, 2012 06:14:08 Timon Gehr wrote: > If it is optimal for D lexing and close-optimal or optimal for other > languages then it is profoundly more useful than just a D lexer. I fully concur that we should have a generic lexer, but unless the generic lexer can be just as fast as

Re: std.d.lexer requirements

2012-08-02 Thread Timon Gehr
On 08/03/2012 05:53 AM, Jonathan M Davis wrote: On Thursday, August 02, 2012 23:41:39 Andrei Alexandrescu wrote: On 8/2/12 11:08 PM, Jonathan M Davis wrote: You're not going to get as fast a lexer if it's not written specifically for D. Writing a generic lexer is a different problem. It's also

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Friday, August 03, 2012 06:02:29 Bernard Helyer wrote: > I'll let you get on with it then. I'll amuse myself with the > thought of someone asking why SDC doesn't use std.d.lexer or > a parser generator. I'll then hit them with my cane, and tell > them to get off of my lawn. :P Well, if std.d.le

Re: std.d.lexer requirements

2012-08-02 Thread Bernard Helyer
On Friday, 3 August 2012 at 03:59:29 UTC, Jonathan M Davis wrote: On Friday, August 03, 2012 05:36:05 Bernard Helyer wrote: If the other guys think they've got it, then I can withdraw my efforts. I was just thinking I had a lexer just sitting around, may as well use it, but if the other guys hav

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Friday, August 03, 2012 05:36:05 Bernard Helyer wrote: > If the other guys think they've got it, then I can withdraw my > efforts. I was just thinking I had a lexer just sitting around, > may as well use it, but if the other guys have it, then I'm fine > with withdrawing. I'm a fair ways along

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Thursday, August 02, 2012 23:41:39 Andrei Alexandrescu wrote: > On 8/2/12 11:08 PM, Jonathan M Davis wrote: > > You're not going to get as fast a lexer if it's not written specifically > > for D. Writing a generic lexer is a different problem. It's also one that > > needs to be solved, but I thi

Re: std.d.lexer requirements

2012-08-02 Thread Andrei Alexandrescu
On 8/2/12 11:11 PM, Bernard Helyer wrote: On Friday, 3 August 2012 at 03:00:42 UTC, Andrei Alexandrescu wrote: The lexer must be configurable enough to tokenize other languages than D. You're going to have to defend that one. I wouldn't know how to. To me it's all too obvious it's better to

Re: std.d.lexer requirements

2012-08-02 Thread Andrei Alexandrescu
On 8/2/12 11:08 PM, Jonathan M Davis wrote: You're not going to get as fast a lexer if it's not written specifically for D. Writing a generic lexer is a different problem. It's also one that needs to be solved, but I think that it's a mistake to think that a generic lexer is going to be able to b

Re: std.d.lexer requirements

2012-08-02 Thread Bernard Helyer
On Friday, 3 August 2012 at 03:14:14 UTC, Walter Bright wrote: On 8/2/2012 8:00 PM, Andrei Alexandrescu wrote: On 8/2/12 10:40 PM, Walter Bright wrote: To reiterate another point, since we are in the compiler business, people will expect std.d.lexer to be of top quality, not some bag on the si

Re: std.d.lexer requirements

2012-08-02 Thread Walter Bright
On 8/2/2012 8:00 PM, Andrei Alexandrescu wrote: On 8/2/12 10:40 PM, Walter Bright wrote: To reiterate another point, since we are in the compiler business, people will expect std.d.lexer to be of top quality, not some bag on the side. It needs to be usable as a base for writing a professional qu

Re: std.d.lexer requirements

2012-08-02 Thread Bernard Helyer
On Friday, 3 August 2012 at 03:00:42 UTC, Andrei Alexandrescu wrote: The lexer must be configurable enough to tokenize other languages than D. You're going to have to defend that one.

Re: std.d.lexer requirements

2012-08-02 Thread Timon Gehr
On 08/03/2012 05:08 AM, Jonathan M Davis wrote: On Thursday, August 02, 2012 23:00:41 Andrei Alexandrescu wrote: On 8/2/12 10:40 PM, Walter Bright wrote: To reiterate another point, since we are in the compiler business, people will expect std.d.lexer to be of top quality, not some bag on the s

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Thursday, August 02, 2012 23:00:41 Andrei Alexandrescu wrote: > On 8/2/12 10:40 PM, Walter Bright wrote: > > To reiterate another point, since we are in the compiler business, > > people will expect std.d.lexer to be of top quality, not some bag on the > > side. It needs to be usable as a base f

Re: std.d.lexer requirements

2012-08-02 Thread Andrei Alexandrescu
On 8/2/12 10:40 PM, Walter Bright wrote: To reiterate another point, since we are in the compiler business, people will expect std.d.lexer to be of top quality, not some bag on the side. It needs to be usable as a base for writing a professional quality compiler. It's the reason why I'm pushing m

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Thursday, August 02, 2012 19:40:13 Walter Bright wrote: > No, I'm arguing that the LEXER should accept a UTF8 input range for its > input. I am not making a general argument about ranges, characters, or > Phobos. I think that this is the main point of misunderstanding then. From your comments,

Re: std.d.lexer requirements

2012-08-02 Thread Walter Bright
On 8/2/2012 3:38 PM, Jonathan M Davis wrote: On Thursday, August 02, 2012 15:14:17 Walter Bright wrote: Remember, it's the consumer doing the decoding, not the input range. But that's the problem. The consumer has to treat character ranges specially to make this work. It's not generic. If it we

Re: std.d.lexer requirements

2012-08-02 Thread Walter Bright
On 8/2/2012 4:30 PM, Andrei Alexandrescu wrote: I think Walter has very often emphasized the need for the lexer to be faster than the usual client software. My perception is that he's discussing lexer design in understanding there's a need for a less comfortable approach, namely do decoding in cl

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Thursday, August 02, 2012 19:52:35 Jonathan M Davis wrote: > I suppose that we could make it operate on code units and just let ranges of > dchar have UTF-32 as their code unit (since dchar is both a code unit and a > code point), then ranges of dchar will still work but ranges of char and > wch

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Thursday, August 02, 2012 19:30:47 Andrei Alexandrescu wrote: > On 8/2/12 7:18 PM, Jonathan M Davis wrote: > Your insights are always appreciated; even their Cliff notes :o). LOL. Well, I'm not about to decide on the best approach to this without thinking through it more. What I've been doing

Re: std.d.lexer requirements

2012-08-02 Thread Andrei Alexandrescu
On 8/2/12 7:18 PM, Jonathan M Davis wrote: On Thursday, August 02, 2012 19:06:32 Andrei Alexandrescu wrote: Sure, you could have a function which specifically operates on ranges of code units and understands how unicode works and is written accordingly, but then that function is specific to rang

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Thursday, August 02, 2012 19:06:32 Andrei Alexandrescu wrote: > > Sure, you could have a function which specifically operates on ranges of > > code units and understands how unicode works and is written accordingly, > > but then that function is specific to ranges of code units and is only > > g

Re: std.d.lexer requirements

2012-08-02 Thread Andrei Alexandrescu
On 8/2/12 6:54 PM, Jonathan M Davis wrote: So, a function which does the buffering of code units like Walter suggests is generic? Of course, because it operates on bytes read from memory, files, or sockets etc. It's doing something that makes no sense outside of strings. Right. The bytes

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Thursday, August 02, 2012 18:41:23 Andrei Alexandrescu wrote: > On 8/2/12 6:38 PM, Jonathan M Davis wrote: > > On Thursday, August 02, 2012 15:14:17 Walter Bright wrote: > >> Remember, it's the consumer doing the decoding, not the input range. > > > > But that's the problem. The consumer has to

Re: std.d.lexer requirements

2012-08-02 Thread Piotr Szturmaj
Walter Bright wrote: On 8/2/2012 1:26 PM, Jonathan M Davis wrote: On Thursday, August 02, 2012 01:44:18 Walter Bright wrote: Keep a 6 character buffer in your consumer. If you read a char with the high bit set, start filling that buffer and then decode it. And how on earth is that going to wo
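
The consumer-side decoding quoted in this message (take ASCII bytes directly, and buffer and decode only when the high bit is set) can be sketched roughly as below. Several assumptions apply: the sketch is C++ rather than D, `decodeNext` and `decodeAll` are hypothetical names, and it follows modern UTF-8, whose sequences are at most 4 bytes long, so a 3-byte continuation buffer stands in for the 6-character buffer mentioned in the quote. It rejects bad lead and continuation bytes but does not check for overlong encodings or surrogates.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Decodes one code point from a range of UTF-8 code units, advancing `it`.
// Precondition: it != end. The input only has to be iterable one byte at a
// time, so it could equally be a file or socket adapter.
template <typename It>
char32_t decodeNext(It& it, It end) {
    const unsigned char lead = static_cast<unsigned char>(*it++);
    if (lead < 0x80) return lead;  // ASCII fast path: no buffering needed

    int extra = 0;    // number of continuation bytes still expected
    char32_t cp = 0;  // payload bits collected so far
    if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
    else throw std::runtime_error("invalid UTF-8 lead byte");

    unsigned char buf[3];  // small consumer-side buffer for the sequence
    for (int i = 0; i < extra; ++i) {
        if (it == end) throw std::runtime_error("truncated UTF-8 sequence");
        buf[i] = static_cast<unsigned char>(*it++);
        if ((buf[i] & 0xC0) != 0x80) {
            throw std::runtime_error("invalid UTF-8 continuation byte");
        }
    }
    for (int i = 0; i < extra; ++i) cp = (cp << 6) | (buf[i] & 0x3F);
    return cp;
}

// Convenience wrapper: decode a whole byte string into code points.
std::vector<char32_t> decodeAll(const std::string& bytes) {
    std::vector<char32_t> out;
    auto it = bytes.begin();
    while (it != bytes.end()) out.push_back(decodeNext(it, bytes.end()));
    return out;
}
```

Because most source text is ASCII, the common case here is a single comparison per byte, which is the performance argument made for decoding in the consumer.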

Re: std.d.lexer requirements

2012-08-02 Thread Andrei Alexandrescu
On 8/2/12 6:38 PM, Jonathan M Davis wrote: On Thursday, August 02, 2012 15:14:17 Walter Bright wrote: Remember, it's the consumer doing the decoding, not the input range. But that's the problem. The consumer has to treat character ranges specially to make this work. It's not generic. It is ge

Re: std.d.lexer requirements

2012-08-02 Thread Jonathan M Davis
On Thursday, August 02, 2012 15:14:17 Walter Bright wrote: > Remember, it's the consumer doing the decoding, not the input range. But that's the problem. The consumer has to treat character ranges specially to make this work. It's not generic. If it were generic, then it would simply be using fro

Re: std.d.lexer requirements

2012-08-02 Thread Dmitry Olshansky
On 03-Aug-12 02:10, Walter Bright wrote: On 8/2/2012 1:41 PM, Jacob Carlborg wrote: On 2012-08-02 21:35, Walter Bright wrote: A good IDE should do its parsing in a separate thread, so the main user input thread remains crisp and responsive. If the user edits the text while the parsing is in p

Re: std.d.lexer requirements

2012-08-02 Thread Walter Bright
On 8/2/2012 1:26 PM, Jonathan M Davis wrote: On Thursday, August 02, 2012 01:44:18 Walter Bright wrote: On 8/2/2012 1:38 AM, Jonathan M Davis wrote: On Thursday, August 02, 2012 01:14:30 Walter Bright wrote: On 8/2/2012 12:43 AM, Jonathan M Davis wrote: It is for ranges in general. In the gen
