Re: Request for comments: std.d.lexer
On Saturday, 9 February 2013 at 07:55:04 UTC, Walter Bright wrote: On 2/4/2013 7:19 PM, Brian Schott wrote: Still only half speed. I'm becoming more and more convinced that Walter is actually a wizard. I worship the Dark Arts. *recites incantations* *merges pull requests* *adds unit tests* http://hackerpilot.github.com/experimental/std_lexer/images/times4.png
Re: Request for comments: std.d.lexer
On 2013-02-28 16:34, Andrej Mitrovic wrote: On 2/28/13, Brian Schott briancsch...@gmail.com wrote: http://hackerpilot.github.com/experimental/std_lexer/images/times4.png That's a nice plot, but what is the X axis? Lexing time in milliseconds.
Re: Request for comments: std.d.lexer
28-Feb-2013 13:09, Brian Schott wrote: On Saturday, 9 February 2013 at 07:55:04 UTC, Walter Bright wrote: On 2/4/2013 7:19 PM, Brian Schott wrote: Still only half speed. I'm becoming more and more convinced that Walter is actually a wizard. I worship the Dark Arts. *recites incantations* *merges pull requests* *adds unit tests* http://hackerpilot.github.com/experimental/std_lexer/images/times4.png Looking far better now. Keep in mind that we still don't use the original dmd's sentinel magic to avoid the length check in std.d.lexer :) Being the wizard who gave a couple of scrolls to Brian, I'll outline some interesting data points collected while helping him out. - The D run-time takes some time to start up/shut down. It's on the order of 1 ms for me vs 1/10th of a ms for plain C. - Expanding on the previous point: GC.malloc takes some time and triggers collections from time to time (2 collections to lex datetime; there is next to no garbage, but they run regardless). Even simply disabling the GC at the start of the program and re-enabling it at the end speeds up the whole run by roughly 5% on my machine. - opApply is definitely slower than an inlined range-based foreach loop, even with a scope delegate. Other than these specifics, the prime spells that give the performance are: - don't use built-in AAs or subtle allocations (avoid the GC as far as you can) - keep the common path lean; keep there only the operations that are required in *every* code path - avoid ever doing the same work twice (redesign, hack, and whatnot, but don't do it) -- Dmitry Olshansky
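Dmitry's GC experiment is easy to reproduce. The sketch below is my illustration, not code from the thread; the token-appending loop is a stand-in for the lexer's allocation-heavy work, and only the disable/re-enable pattern is the point:

```d
import core.memory : GC;
import std.stdio : writeln;

void main()
{
    GC.disable();            // no collections during the hot lexing phase
    scope(exit) GC.enable(); // re-enable on every exit path

    // stand-in for the lexer's allocation-heavy work
    string[] tokens;
    foreach (i; 0 .. 100_000)
        tokens ~= "tok";
    writeln(tokens.length);
}
```

Allocations still happen while the GC is disabled; only the collection cycles are suppressed, which is why the win is modest when there is little garbage.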
Re: Request for comments: std.d.lexer
28-Feb-2013 21:27, Dmitry Olshansky wrote: 28-Feb-2013 13:09, Brian Schott wrote: On Saturday, 9 February 2013 at 07:55:04 UTC, Walter Bright wrote: On 2/4/2013 7:19 PM, Brian Schott wrote: Still only half speed. I'm becoming more and more convinced that Walter is actually a wizard. I worship the Dark Arts. *recites incantations* *merges pull requests* *adds unit tests* http://hackerpilot.github.com/experimental/std_lexer/images/times4.png Looking far better now. Keep in mind that we still don't use the original dmd's sentinel magic to avoid the length check in std.d.lexer :) Being the wizard who gave a couple of scrolls to Brian, I'll outline some interesting data points collected while helping him out. To clarify, these are problems I observed but haven't solved/avoided completely: - The D run-time takes some time to start up/shut down. It's on the order of 1 ms for me vs 1/10th of a ms for plain C. - Expanding on the previous point: GC.malloc takes some time and triggers collections from time to time (2 collections to lex datetime; there is next to no garbage, but they run regardless). And this one below is an experiment and is not related to the numbers posted: Even simply disabling the GC at the start of the program and re-enabling it at the end speeds up the whole run by roughly 5% on my machine. -- Dmitry Olshansky
Re: Request for comments: std.d.lexer
On 2/4/2013 7:19 PM, Brian Schott wrote: Still only half speed. I'm becoming more and more convinced that Walter is actually a wizard. I worship the Dark Arts.
Re: Request for comments: std.d.lexer
On Saturday, February 09, 2013 08:19:33 deadalnix wrote: So, std.utf.decodeFront may or may not pop the UTF character. And in case it does not, you end up having to do extra hanky panky (and duplicate logic in std.utf to know whether it pops or not, which is likely to be very bug prone). Well, with the pull request that I have, it'll always pop for input ranges and never for anything else. I'm disinclined to have it pop off elements at all, because I'd like it to function as much like decode as possible, but there's no choice with pure input ranges, and it may be better to just make it always pop the elements off. I'll have to think about it. I hate pure input ranges in general. They're just so frustrating to deal with, and I believe that you can always have a forward range if you're willing to put a bit more effort into it. The lexer that I'm writing doesn't even support pure input ranges, because it has to be able to save to do what it does, and the only reason that decodeFront exists is because I wrote it to support that use case. And needing to operate on ranges of code units is likely to be quite rare, and the function is quite new, so I don't expect that much code is affected by any bugs in decodeFront or that much would be broken were it to be changed. - Jonathan M Davis
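For reference, a minimal sketch of using std.utf.decodeFront on a string, following the "always pop the consumed code units" behavior discussed here (the exact popping semantics were still being settled at the time of this thread):

```d
import std.utf : decodeFront;
import std.stdio : writeln;

void main()
{
    auto s = "héllo"; // 'é' is two UTF-8 code units
    size_t consumed;
    // Decodes the code point at the front and (in the popping variant)
    // advances s past the consumed code units.
    dchar c = s.decodeFront(consumed);
    writeln(c, " ", consumed); // decodes 'h' from one code unit
}
```

The out parameter reporting the number of consumed code units is what lets a caller track byte offsets even when the range is popped for them.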
Re: Request for comments: std.d.lexer
On Friday, February 08, 2013 23:48:50 Walter Bright wrote: On 2/8/2013 12:01 AM, Jonathan M Davis wrote: It's not quite a use case where ranges shine - especially when efficiency is a top priority. A more problematic case is dmd's lexer relies on a 0 byte at the end to be a sentinel for the end of file. Without such a sentinel, every consumption of a source character requires two operations rather than one. I didn't know that. That's a cute trick. But yeah, without controlling the input, it's not possible, and that won't work with a general implementation in a library. It would be trivial enough to put a wrapper around the input to add the 0 byte at the end, but the wrapper would still have to keep checking for empty, so you wouldn't gain anything. - Jonathan M Davis
Re: Request for comments: std.d.lexer
On Feb 9, 2013 8:01 AM, Walter Bright newshou...@digitalmars.com wrote: On 2/4/2013 7:19 PM, Brian Schott wrote: Still only half speed. I'm becoming more and more convinced that Walter is actually a wizard. I worship the Dark Arts. More like you cursed the gods when you wrote it. :-)
Re: Request for comments: std.d.lexer
On 2/9/13 3:07 AM, Jonathan M Davis wrote: On Friday, February 08, 2013 23:48:50 Walter Bright wrote: On 2/8/2013 12:01 AM, Jonathan M Davis wrote: It's not quite a use case where ranges shine - especially when efficiency is a top priority. A more problematic case is dmd's lexer relies on a 0 byte at the end to be a sentinel for the end of file. Without such a sentinel, every consumption of a source character requires two operations rather than one. I didn't know that. That's a cute trick. But yeah, without controlling the input, it's not possible, and that won't work with a general implementation in a library. It would be trivial enough to put a wrapper around the input to add the 0 byte at the end, but the wrapper would still have to keep checking for empty, so you wouldn't gain anything. That's not a problem. You may require that the input has a terminating zero. Andrei
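The sentinel idea in a nutshell (my sketch, not dmd's actual code): with a guaranteed terminating 0 byte, the scanner's inner loop needs only one comparison per character, because the sentinel itself fails the match and stops the loop, so no separate bounds check is ever executed:

```d
// Assumes the buffer is zero-terminated, as dmd guarantees for its
// source buffers (and as D string literals happen to be).
size_t skipWhitespace(const(char)* p)
{
    auto start = p;
    // ' ' and '\t' advance; anything else -- including the 0 sentinel
    // at the end of the buffer -- stops the loop. No "p < end" check.
    while (*p == ' ' || *p == '\t')
        ++p;
    return p - start;
}

void main()
{
    string src = "   int x;"; // literal is 0-terminated
    assert(skipWhitespace(src.ptr) == 3);
}
```

Without the sentinel, every character consumed costs both an end-of-input test and the character test, which is the "two operations rather than one" Walter mentions.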
Re: Request for comments: std.d.lexer
On 2013-02-09 09:05, Jonathan M Davis wrote: The lexer that I'm writing doesn't even support pure input ranges, because it has to be able to save to do what it does, and the only reason that decodeFront exists is because I wrote it to support that use case. And needing to operate on ranges of code units is likely to be quite rare, and the function is quite new, so I don't expect that much code is affected by any bugs in decodeFront or that much would be broken were it to be changed. Following this discussion it seems it's complicated to support ranges in a lexer and still maintain the speed. Is it really worth the trouble to support ranges? -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 2013-02-08 17:35, Brian Schott wrote: On Friday, 8 February 2013 at 15:35:55 UTC, Dmitry Olshansky wrote: Would be interesting to try gdc, as dmd on linux uses gcc. GDC decided to randomly not fail to build on my system, so it's time for MOAR CHARTS. dmd -inline -noboundscheck -O -w -wi -m64 -property ldc2 -O2 -release -vectorize -m64 gdc -O3 -fno-bounds-check -frelease -m64 http://hackerpilot.github.com/experimental/std_lexer/images/times3-all.png Do you have the exact code you test available somewhere? -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 2/9/13 10:05 AM, Jacob Carlborg wrote: On 2013-02-09 09:05, Jonathan M Davis wrote: The lexer that I'm writing doesn't even support pure input ranges, because it has to be able to save to do what it does, and the only reason that decodeFront exists is because I wrote it to support that use case. And needing to operate on ranges of code units is likely to be quite rare, and the function is quite new, so I don't expect that much code is affected by any bugs in decodeFront or that much would be broken were it to be changed. Following this discussion it seems it's complicated to support ranges in a lexer and still maintain the speed. Is it really worth the trouble to support ranges? Requiring a random-access range of ubyte with a terminating zero may be the most general approach to a fast lexer - and that's fine. Andrei
Re: Request for comments: std.d.lexer
On 2013-02-09 15:37, Andrei Alexandrescu wrote: That's not a problem. You may require that the input has a terminating zero. Can't you just append a terminating zero in the lexer if it's a range? -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 2013-02-09 16:10, Andrei Alexandrescu wrote: Requiring a random-access range of ubyte with a terminating zero may be the most general approach to a fast lexer - and that's fine. Requiring a random-access range probably makes it easier. People here seem to be trying to support ranges with less functionality, like input or forward ranges. -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 2/9/13 10:37 AM, Jacob Carlborg wrote: On 2013-02-09 16:10, Andrei Alexandrescu wrote: Requiring a random-access range of ubyte with a terminating zero may be the most general approach to a fast lexer - and that's fine. Requiring a random-access range probably makes it easier. People here seem to be trying to support ranges with less functionality, like input or forward ranges. Yah. The way I see it is, start with a random-access range and then see what the use patterns are. Then possibly refine. Andrei
Re: Request for comments: std.d.lexer
On 2/9/13 10:38 AM, Jacob Carlborg wrote: On 2013-02-09 15:37, Andrei Alexandrescu wrote: That's not a problem. You may require that the input has a terminating zero. Can't you just append a terminating zero in the lexer if it's a range? You require it. It's the client's burden. Andrei
Re: Request for comments: std.d.lexer
On 2013-02-09 16:46, Andrei Alexandrescu wrote: You require it. It's the client's burden. Ok, I see. -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On Friday, 8 February 2013 at 16:00:05 UTC, Dmitry Olshansky wrote: 08-Feb-2013 19:52, Iain Buclaw wrote: That's still lies. :o) g++ ? :) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0'; I'd think it's built by DMC++, the Digital Mars C++ compiler.
Re: Request for comments: std.d.lexer
09-Feb-2013 23:20, SomeDude wrote: On Friday, 8 February 2013 at 16:00:05 UTC, Dmitry Olshansky wrote: 08-Feb-2013 19:52, Iain Buclaw wrote: That's still lies. :o) g++ ? :) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0'; I'd think it's built by DMC++, the Digital Mars C++ compiler. On linux, sure ;) -- Dmitry Olshansky
Re: Request for comments: std.d.lexer
09-Feb-2013 19:46, Andrei Alexandrescu wrote: On 2/9/13 10:37 AM, Jacob Carlborg wrote: On 2013-02-09 16:10, Andrei Alexandrescu wrote: Requiring a random-access range of ubyte with a terminating zero may be the most general approach to a fast lexer - and that's fine. Requiring a random-access range probably makes it easier. People here seem to be trying to support ranges with less functionality, like input or forward ranges. Yah. The way I see it is, start with a random-access range and then see what the use patterns are. Then possibly refine. I don't get this. There is no sensible requirement to forbid non-random access. A couple of static ifs and you are done. And I can confidently say that std.d.lexer has quite some room for optimization in both cases, and it doesn't have to sacrifice the generic path. I intend to continue hacking on Brian's implementation and to help him refine it. Any real help (as in work and analysis) is appreciated, thanks. -- Dmitry Olshansky
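Dmitry's "couple of static ifs" means compile-time dispatch on the range category, roughly like this sketch (countUntilZero is a hypothetical illustration, not std.d.lexer code):

```d
import std.range.primitives;

// Sketch: track position by indexing when the input supports random
// access, and by an explicit counter + popFront otherwise. The fast
// path and the generic path coexist in one template.
size_t countUntilZero(R)(R input)
    if (isInputRange!R && is(ElementType!R : ubyte))
{
    static if (isRandomAccessRange!R && hasLength!R)
    {
        size_t i = 0;
        while (i < input.length && input[i] != 0)
            ++i;
        return i;
    }
    else
    {
        size_t i = 0;
        for (; !input.empty && input.front != 0; input.popFront())
            ++i;
        return i;
    }
}

void main()
{
    ubyte[] buf = [1, 2, 3, 0, 4];
    assert(countUntilZero(buf) == 3); // takes the random-access branch
}
```

Only the branch matching the instantiating range type is compiled, so the generic path costs the fast path nothing.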
Re: Request for comments: std.d.lexer
On Sunday, February 10, 2013 00:26:41 Dmitry Olshansky wrote: 09-Feb-2013 19:46, Andrei Alexandrescu wrote: On 2/9/13 10:37 AM, Jacob Carlborg wrote: On 2013-02-09 16:10, Andrei Alexandrescu wrote: Requiring a random-access range of ubyte with a terminating zero may be the most general approach to a fast lexer - and that's fine. Requiring a random-access range probably makes it easier. People here seem to be trying to support ranges with less functionality, like input or forward ranges. Yah. The way I see it is, start with a random-access range and then see what the use patterns are. Then possibly refine. I don't get this. There is no sensible requirement to forbid non-random access. A couple of static ifs and you are done. And I can confidently say that std.d.lexer has quite some room for optimization in both cases, and it doesn't have to sacrifice the generic path. It may end up being less efficient with some types of ranges, but it can still work, and someone's use case may not care about that extra loss of efficiency. Those who really do can use RA ranges or strings or whatever. But if we throw too many extra restrictions on the ranges (like requiring them to be random access or to end with 0), then pretty soon you might as well require that they use string, because the extra benefit of the few extra types of ranges which would work would be pretty minimal. - Jonathan M Davis
Re: Request for comments: std.d.lexer
On Tuesday, February 05, 2013 22:51:32 Andrei Alexandrescu wrote: I think it would be reasonable for a lexer to require a range of ubyte as input, and carry its own decoding. In the first approximation it may even require a random-access range of ubyte. Another big issue is the fact that in some ways, using a pointer like dmd's lexer does is actually superior to using a range. In particular, it's trivial to determine where in the text a token is, because you can simply subtract the pointer in the token from the initial pointer. Strings would be okay too, because you can subtract their ptr properties. But the closest that you'll get with ranges is to subtract their lengths, and the only ranges that are likely to define length are random-access ranges. And to do that, you'd either have to keep calculating the index for each token as it's generated or save the range with every token (rather than just having a pointer) so that you could determine the index later if you needed to. And depending on the range, all of that saving could be expensive. And for any other type of range, you'd literally have to count the code units as you iterated in order to figure out what the index is (and you'd have to keep saving the range as you went along if you wanted to slice it at all, since it wouldn't actually be sliceable, and so getting to a particular index in the range would be very expensive even if you kept track of it). And for syntax highlighting and some error reporting and a variety of other uses, you need to be able to determine where in the text a token was (not just its column and line number). And that's simply a lot easier with a pointer. So, dealing with generic ranges is a bit problematic - far more so than any issues with character types.
If the lexer is well-written, the extra overhead can be obviated by having the lexer function do stuff a bit differently depending on the type of the range, but regardless, restricting it to strings or pointers would be cleaner in that regard. It's not quite a use case where ranges shine - especially when efficiency is a top priority. - Jonathan M Davis
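The pointer-subtraction technique Jonathan describes, in the string case via the ptr property, can be sketched as follows (Token and tokenAt are hypothetical names for illustration, not the proposed API):

```d
// A token's byte offset in the source is one subtraction away when the
// scanner holds a raw pointer into the source buffer.
struct Token
{
    size_t offset; // index of the token's first code unit in the source
}

Token tokenAt(string source, const(char)* tokenStart)
{
    // source.ptr is the start of the buffer; the difference is the index.
    return Token(tokenStart - source.ptr);
}

void main()
{
    string src = "int x;";
    auto p = src.ptr + 4; // scanner currently sitting on 'x'
    assert(tokenAt(src, p).offset == 4);
}
```

With a generic range there is no ptr to subtract, which is why the thread keeps coming back to either carrying an explicit offset or saving the range per token.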
Re: Request for comments: std.d.lexer
08-Feb-2013 12:01, Jonathan M Davis wrote: On Tuesday, February 05, 2013 22:51:32 Andrei Alexandrescu wrote: I think it would be reasonable for a lexer to require a range of ubyte as input, and carry its own decoding. In the first approximation it may even require a random-access range of ubyte. Another big issue is the fact that in some ways, using a pointer like dmd's lexer does is actually superior to using a range. In particular, it's trivial to determine where in the text a token is, because you can simply subtract the pointer in the token from the initial pointer. Strings would be okay too, because you can subtract their ptr properties. But the closest that you'll get with ranges is to subtract their lengths, and the only ranges that are likely to define length are random-access ranges. Not true: certain ranges know their length but can't be random access, as indexing is O(lg N) or worse - including a stripe of chunks taken from a file. -- Dmitry Olshansky
Re: Request for comments: std.d.lexer
On Friday, February 08, 2013 12:12:30 Dmitry Olshansky wrote: 08-Feb-2013 12:01, Jonathan M Davis wrote: On Tuesday, February 05, 2013 22:51:32 Andrei Alexandrescu wrote: I think it would be reasonable for a lexer to require a range of ubyte as input, and carry its own decoding. In the first approximation it may even require a random-access range of ubyte. Another big issue is the fact that in some ways, using a pointer like dmd's lexer does is actually superior to using a range. In particular, it's trivial to determine where in the text a token is, because you can simply subtract the pointer in the token from the initial pointer. Strings would be okay too, because you can subtract their ptr properties. But the closest that you'll get with ranges is to subtract their lengths, and the only ranges that are likely to define length are random-access ranges. Not true: certain ranges know their length but can't be random access, as indexing is O(lg N) or worse - including a stripe of chunks taken from a file. I said that the only ones which are likely to define length are random-access ranges. There _are_ other ranges which can, but in most cases, if you can know the length, you can do random access as well. Regardless, the main issue still stands in that dealing with keeping track of the index of the code unit of a token is more complicated and generally more expensive with ranges than it is with a pointer. Some range types will do better than others, but short of using a string's ptr property, there's always going to be some additional overhead in comparison to pointers to keep track of the indices or to keep a range or slice of one as part of a token. The pointer's just more lightweight. That doesn't make ranges unacceptable by any means. It just means that they're going to take at least a slight performance hit in comparison to pointers. - Jonathan M Davis
Re: Request for comments: std.d.lexer
08-Feb-2013 13:40, Jonathan M Davis wrote: On Friday, February 08, 2013 12:12:30 Dmitry Olshansky wrote: 08-Feb-2013 12:01, Jonathan M Davis wrote: On Tuesday, February 05, 2013 22:51:32 Andrei Alexandrescu wrote: I think it would be reasonable for a lexer to require a range of ubyte as input, and carry its own decoding. In the first approximation it may even require a random-access range of ubyte. Another big issue is the fact that in some ways, using a pointer like dmd's lexer does is actually superior to using a range. In particular, it's trivial to determine where in the text a token is, because you can simply subtract the pointer in the token from the initial pointer. Strings would be okay too, because you can subtract their ptr properties. But the closest that you'll get with ranges is to subtract their lengths, and the only ranges that are likely to define length are random-access ranges. Not true: certain ranges know their length but can't be random access, as indexing is O(lg N) or worse - including a stripe of chunks taken from a file. I said that the only ones which are likely to define length are random-access ranges. There _are_ other ranges which can, but in most cases, if you can know the length, you can do random access as well. Well, I honestly disagree about the promise that knowing the length implies being able to index. Most ranges are arrays or wrappers on top of them, though; given the current realities of D and Phobos, I'm afraid you are right. Regardless, the main issue still stands in that dealing with keeping track of the index of the code unit of a token is more complicated and generally more expensive with ranges than it is with a pointer. If the target is a random-access range, just use an offset throughout. It basically becomes base + offset vs. base + pointer, i.e. a non-issue. If not, then the pointer argument no longer applies and you can just as well use a separate counter incremented on each popFront.
It'd be not that costly in this case, and flexibility trumps other concerns with forward ranges in any case. Some range types will do better than others, but short of using a string's ptr property, there's always going to be some additional overhead in comparison to pointers to keep track of the indices or to keep a range or slice of one as part of a token. The pointer's just more lightweight. That doesn't make ranges unacceptable by any means. It just means that they're going to take at least a slight performance hit in comparison to pointers. See above. A pointer to something inside a buffer == an index into the buffer; typically even with a pointer you can't drop the 'buffer' reference itself. -- Dmitry Olshansky
Re: Request for comments: std.d.lexer
On Tuesday, 5 February 2013 at 19:00:42 UTC, Dmitry Olshansky wrote: Thanks. I've made a pass through the code and found some places to improve. Sadly I've been rather busy at work today. Anyway get ready for pull requests should my ideas prove to be worthy performance-wise. http://hackerpilot.github.com/experimental/std_lexer/images/times3.png Your suggestions seem to have worked. We're below 20ms on my machine for the datetime module.
Re: Request for comments: std.d.lexer
On 2013-02-08 16:08, Brian Schott wrote: http://hackerpilot.github.com/experimental/std_lexer/images/times3.png Your suggestions seem to have worked. We're below 20ms on my machine for the datetime module. DMD is still consistently faster, :( -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On Friday, 8 February 2013 at 15:12:23 UTC, Jacob Carlborg wrote: DMD is still consistently faster, :( We might be getting to the part where we say that code generated by gcc is still consistently faster than code generated by ldc.
Re: Request for comments: std.d.lexer
08-Feb-2013 19:12, Jacob Carlborg wrote: On 2013-02-08 16:08, Brian Schott wrote: http://hackerpilot.github.com/experimental/std_lexer/images/times3.png Your suggestions seem to have worked. We're below 20ms on my machine for the datetime module. DMD is still consistently faster :( Keep in mind the D runtime start-up cost. In the end, on small files DMD always wins because of the slim C runtime. To estimate the runtime startup lag I've compared run times of int main(){ return 0; } in C (gcc):
Total time (ms): 31637.7; Repetitions: 35000; Sample mode: 0.8 (23649 occurrences)
Median time: 0.873; Avg time: 0.903935; Std dev.: 0.204107; Minimum: 0.552; Maximum: 9.057
95% conf. int.: [0.503892, 1.30398] e = 0.400043; 99% conf. int.: [0.378189, 1.42968] e = 0.525746
EstimatedAvg95%: [0.901796, 0.906073] e = 0.00213832; EstimatedAvg99%: [0.901125, 0.906745] e = 0.00281023
and of void main(){} in D (ldc):
Total time (ms): 15429.4; Repetitions: 7000; Sample mode: 2.1 (1785 occurrences)
Median time: 2.128; Avg time: 2.2042; Std dev.: 0.466834; Minimum: 1.286; Maximum: 19.933
95% conf. int.: [1.28922, 3.11918] e = 0.914978; 99% conf. int.: [1.00171, 3.40668] e = 1.20248
EstimatedAvg95%: [2.19326, 2.21514] e = 0.0109361; EstimatedAvg99%: [2.18983, 2.21857] e = 0.0143724
I think that the mode could serve as an indicator of the most probable outcome. For me it's around 0.8 vs 2.1 ms. The D runtime rides on top of the C one, plus the same OS costs to launch a process, so from now on let's keep in mind these extra ~1.3 ms of bias in favor of a C-based lexer. -- Dmitry Olshansky
Re: Request for comments: std.d.lexer
On 2013-02-08 16:28, Dmitry Olshansky wrote: 08-Feb-2013 19:12, Jacob Carlborg wrote: On 2013-02-08 16:08, Brian Schott wrote: http://hackerpilot.github.com/experimental/std_lexer/images/times3.png Your suggestions seem to have worked. We're below 20ms on my machine for the datetime module. DMD is still consistently faster :( Keep in mind the D runtime start-up cost. In the end, on small files DMD always wins because of the slim C runtime. To estimate the runtime startup lag I've compared run times of int main(){ return 0; } in C (gcc) [snip timing statistics] and of void main(){} in D (ldc) [snip timing statistics]. I think that the mode could serve as an indicator of the most probable outcome. For me it's around 0.8 vs 2.1 ms. The D runtime rides on top of the C one, plus the same OS costs to launch a process, so from now on let's keep in mind these extra ~1.3 ms of bias in favor of a C-based lexer. Ok, I see. But wouldn't it be more fair to compare either GCC to GDC or Clang to LDC? -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On Friday, 8 February 2013 at 15:23:00 UTC, Brian Schott wrote: On Friday, 8 February 2013 at 15:12:23 UTC, Jacob Carlborg wrote: DMD is still consistently faster, :( We might be getting to the part where we say that code generated by gcc is still consistently faster than code generated by ldc. For the lulz I compiled with dmd -release -inline -noboundscheck -O -m64 -property: http://hackerpilot.github.com/experimental/std_lexer/images/times3-dmd.png Yes, dmd vs ldc actually is a matter of 32 vs 18 ms for datetime.
Re: Request for comments: std.d.lexer
08-Feb-2013 19:34, Brian Schott wrote: On Friday, 8 February 2013 at 15:23:00 UTC, Brian Schott wrote: On Friday, 8 February 2013 at 15:12:23 UTC, Jacob Carlborg wrote: DMD is still consistently faster :( We might be getting to the part where we say that code generated by gcc is still consistently faster than code generated by ldc. For the lulz I compiled with dmd -release -inline -noboundscheck -O -m64 -property: http://hackerpilot.github.com/experimental/std_lexer/images/times3-dmd.png Yes, dmd vs ldc actually is a matter of 32 vs 18 ms for datetime. Would be interesting to try gdc, as dmd on linux uses gcc. -- Dmitry Olshansky
Re: Request for comments: std.d.lexer
On 8 February 2013 15:35, Dmitry Olshansky dmitry.o...@gmail.com wrote: 08-Feb-2013 19:34, Brian Schott wrote: On Friday, 8 February 2013 at 15:23:00 UTC, Brian Schott wrote: On Friday, 8 February 2013 at 15:12:23 UTC, Jacob Carlborg wrote: DMD is still consistently faster :( We might be getting to the part where we say that code generated by gcc is still consistently faster than code generated by ldc. For the lulz I compiled with dmd -release -inline -noboundscheck -O -m64 -property: http://hackerpilot.github.com/experimental/std_lexer/images/times3-dmd.png Yes, dmd vs ldc actually is a matter of 32 vs 18 ms for datetime. Would be interesting to try gdc, as dmd on linux uses gcc. What? That's an outright fib. :-) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Re: Request for comments: std.d.lexer
On Friday, 8 February 2013 at 15:42:47 UTC, Iain Buclaw wrote: What? That's an outright fib. :-) I think he means that on linux the dmd binary is compiled by gcc
Re: Request for comments: std.d.lexer
On 8 February 2013 15:46, Brian Schott briancsch...@gmail.com wrote: On Friday, 8 February 2013 at 15:42:47 UTC, Iain Buclaw wrote: What? That's an outright fib. :-) I think he means that on linux the dmd binary is compiled by gcc That's still lies. :o) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Re: Request for comments: std.d.lexer
08-Feb-2013 19:52, Iain Buclaw wrote: On 8 February 2013 15:46, Brian Schott briancsch...@gmail.com wrote: On Friday, 8 February 2013 at 15:42:47 UTC, Iain Buclaw wrote: What? That's an outright fib. :-) I think he means that on linux the dmd binary is compiled by gcc That's still lies. :o) g++ ? :) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0'; -- Dmitry Olshansky
Re: Request for comments: std.d.lexer
On 8 February 2013 16:00, Dmitry Olshansky dmitry.o...@gmail.com wrote: 08-Feb-2013 19:52, Iain Buclaw wrote: On 8 February 2013 15:46, Brian Schott briancsch...@gmail.com wrote: On Friday, 8 February 2013 at 15:42:47 UTC, Iain Buclaw wrote: What? That's an outright fib. :-) I think he means that on linux the dmd binary is compiled by gcc That's still lies. :o) g++ ? :) I see we could be doing this all day. :þ I'll lay down the hint: dmd compiles the source, not gcc. And although gcc may be invoked during a certain special stage of compilation, it's actually just a driver to call a certain toolchain program that is outside of gcc. :-) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Re: Request for comments: std.d.lexer
On Friday, 8 February 2013 at 15:35:55 UTC, Dmitry Olshansky wrote: Would be interesting to try gdc, as dmd on linux uses gcc. GDC decided to randomly not fail to build on my system, so it's time for MOAR CHARTS. dmd -inline -noboundscheck -O -w -wi -m64 -property ldc2 -O2 -release -vectorize -m64 gdc -O3 -fno-bounds-check -frelease -m64 http://hackerpilot.github.com/experimental/std_lexer/images/times3-all.png
Re: Request for comments: std.d.lexer
On Friday, 8 February 2013 at 16:35:50 UTC, Brian Schott wrote: dmd -inline -noboundscheck -O -w -wi -m64 -property Copy/paste fail. That's actually dmd -release -inline -noboundscheck -O -w -wi -m64 -property
Re: Request for comments: std.d.lexer
On Friday, 8 February 2013 at 16:38:00 UTC, Iain Buclaw wrote: I see we could be doing this all day. :þ We could. I'll lay down the hint: dmd compiles the source, not gcc. And although gcc may be invoked during a certain special stage of compilation, it's actually just a driver to call a certain toolchain program that is outside of gcc. :-) What we're saying is that dmd, The Digital Mars D Compiler, is written in C++ and is thus built by GCC/G++. We can tell by looking at the Makefile that DMD doesn't build itself.
Re: Request for comments: std.d.lexer
On 8 February 2013 16:35, Brian Schott briancsch...@gmail.com wrote: On Friday, 8 February 2013 at 15:35:55 UTC, Dmitry Olshansky wrote: Would be interesting to try gdc, as dmd on linux uses gcc. GDC decided to randomly not fail to build on my system, so it's time for MOAR CHARTS. dmd -inline -noboundscheck -O -w -wi -m64 -property ldc2 -O2 -release -vectorize -m64 gdc -O3 -fno-bounds-check -frelease -m64 http://hackerpilot.github.com/experimental/std_lexer/images/times3-all.png Cool. How do you determine the speed times? -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Re: Request for comments: std.d.lexer
On 8 February 2013 16:41, Brian Schott briancsch...@gmail.com wrote: On Friday, 8 February 2013 at 16:38:00 UTC, Iain Buclaw wrote: I see we could be doing this all day. :þ We could. I'll lay down the hint: dmd compiles the source, not gcc. And although gcc may be invoked during a certain special stage of compilation, it's actually just a driver to call a certain toolchain program that is outside of gcc. :-) What we're saying is that dmd, The Digital Mars D Compiler, is written in C++ and is thus built by GCC/G++. We can tell by looking at the Makefile that DMD doesn't build itself. That still has nothing to do with how dmd links D programs. ;) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Re: Request for comments: std.d.lexer
On Friday, February 08, 2013 14:29:20 Dmitry Olshansky wrote: If the target is a random access range, just use offsets throughout. It basically becomes base + offset vs base + pointer, i.e. a non-issue. If not, then the pointer argument no longer applies and you can just as well use a separate counter on each popFront. It'd be not that costly in this case, and flexibility trumps other concerns with forward ranges in any case. I don't know exactly how costly it ends up being (and certainly, poor design choices in other areas could dwarf that cost), but it does incur extra overhead throughout the lexing process. In most code, it wouldn't be a big deal, but the lexer is trying to be lightning fast, so every extra bit like that is going to add up and slow it down. But you really don't have any choice if you don't even have random access. Regardless, my point is that accepting generic ranges rather than pointers complicates things somewhat and does incur at least a slight performance penalty. - Jonathan M Davis
Re: Request for comments: std.d.lexer
08-Feb-2013 23:25, Jonathan M Davis wrote: On Friday, February 08, 2013 14:29:20 Dmitry Olshansky wrote: If the target is a random access range, just use offsets throughout. It basically becomes base + offset vs base + pointer, i.e. a non-issue. If not, then the pointer argument no longer applies and you can just as well use a separate counter on each popFront. It'd be not that costly in this case, and flexibility trumps other concerns with forward ranges in any case. I don't know exactly how costly it ends up being (and certainly, poor design choices in other areas could dwarf that cost), but it does incur extra overhead throughout the lexing process. In most code, it wouldn't be a big deal, but the lexer is trying to be lightning fast, so every extra bit like that is going to add up and slow it down. I suspect it *might* require an extra register for the base depending on the smartness of the compiler; an extra register in turn could raise the cost. The key to speed here, however, is not in using raw pointers, assembly and/or SIMD. FWIW, if there is only a single base pointer + offsets, the compiler can assume no pointer aliasing and optimize things better. The discussion becomes purely rhetorical unless some hard data is actually presented - and not based on DMD's optimizer, please. But you really don't have any choice if you don't even have random access. Agreed. Regardless, my point is that accepting generic ranges rather than pointers complicates things somewhat and does incur at least a slight performance penalty. Complication - yes; a slight performance cost is what I doubt in the RA case. Seems like a compiler/optimizer issue. For one, I used to write loops with naked pointers in C to get better performance out of crappy compilers. I won't ever do it today, as it would get worse performance in addition to being less readable (I've checked at least on VC++ about a year ago; the compiler rewrites these index-based loops better, YMMV). -- Dmitry Olshansky
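The base + offset vs base + pointer styles Dmitry contrasts can be sketched in a few lines of D. This is a hypothetical illustration (the function names are invented, not from any lexer); with a decent optimizer both loops typically compile to equivalent code, and the index form gives the compiler a no-aliasing guarantee for free.

```d
// C-style pointer walking
size_t countSpacesPtr(const(char)[] text)
{
    size_t n;
    const(char)* p = text.ptr;
    const(char)* e = text.ptr + text.length;
    for (; p < e; ++p)
        if (*p == ' ')
            ++n;
    return n;
}

// base + offset walking: one base pointer, offsets everywhere else
size_t countSpacesIdx(const(char)[] text)
{
    size_t n;
    foreach (i; 0 .. text.length)
        if (text[i] == ' ')
            ++n;
    return n;
}

void main()
{
    // both scans agree, only the addressing style differs
    assert(countSpacesPtr("a b c") == 2);
    assert(countSpacesIdx("a b c") == 2);
}
```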
Re: Request for comments: std.d.lexer
08-Feb-2013 20:51, Iain Buclaw wrote: On 8 February 2013 16:41, Brian Schott briancsch...@gmail.com wrote: On Friday, 8 February 2013 at 16:38:00 UTC, Iain Buclaw wrote: I see we could be doing this all day. :þ We could. I'll lay down the hint: dmd compiles the source, not gcc. And although gcc may be invoked during a certain special stage of compilation, it's actually just a driver to call a certain toolchain program that is outside of gcc. :-) What we're saying is that dmd, The Digital Mars D Compiler, is written in C++ and is thus built by GCC/G++. We can tell by looking at the Makefile that DMD doesn't build itself. That still has nothing to do with how dmd links D programs. ;) We've been discussing DMD's lexer, written in C++, which is obviously compiled using the default compiler on the target platform. That's what is being measured: the good ol' C++ lexer vs the D lexer. I don't see how the way DMD does linking is related to what tool is used to actually build it. -- Dmitry Olshansky
Re: Request for comments: std.d.lexer
On 2013-02-08 17:35, Brian Schott wrote: GDC decided to randomly not fail to build on my system, so it's time for MOAR CHARTS. Interesting. Seems that LDC is faster than GDC but has a bigger start overhead. OT: Why is the chart's legend in reverse order? :P
Re: Request for comments: std.d.lexer
On Friday, February 08, 2013 23:50:13 Dmitry Olshansky wrote: Complication - yes; a slight performance cost is what I doubt in the RA case. Seems like a compiler/optimizer issue. The main problem isn't RA ranges. It's quite probable that the optimizer takes care of any minor additional overhead that's incurred there. The issue is really pure forward ranges, because you're forced to count the code units as they're processed (or even save the range as you go along, given how expensive it would be to go find the index even if you have it). But we really have no way of knowing how bad it is without hard data. It would certainly be trivial to end up doing other stuff in the implementation which cost far more. - Jonathan M Davis
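The forward-range bookkeeping Jonathan describes can be sketched as a thin wrapper (all names here are hypothetical illustrations, not std.d.lexer API): since a pure forward range has no index, the lexer must bump its own counter on every single popFront to know how far it has read.

```d
import std.range;

// wraps any forward range and counts how many elements were popped
struct Counted(R) if (isForwardRange!R)
{
    R input;
    size_t consumed; // the extra per-pop bookkeeping under discussion

    bool empty() { return input.empty; }
    auto front() { return input.front; }
    void popFront() { input.popFront(); ++consumed; } // extra work each pop
    auto save() { return Counted(input.save, consumed); }
}

void main()
{
    auto r = Counted!string("abc", 0);
    r.popFront();
    r.popFront();
    // the wrapper knows its position; the bare range would not
    assert(r.consumed == 2);
}
```

With a random-access range none of this is needed: the current offset is the position, which is why the cost only really bites for pure forward/input ranges.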
Re: Request for comments: std.d.lexer
On Friday, 8 February 2013 at 20:41:52 UTC, FG wrote: On 2013-02-08 17:35, Brian Schott wrote: GDC decided to randomly not fail to build on my system, so it's time for MOAR CHARTS. Interesting. Seems that LDC is faster than GDC but has a bigger start overhead. OT: Why is the chart's legend in reverse order? :P It seems like that, but the differences are small enough that you can't make reliable statements without having a closer look at the statistics. David, who dreams of a world in which every graph has (at least) error bars
Re: Request for comments: std.d.lexer
On 08.02.2013 17:35, Brian Schott wrote: On Friday, 8 February 2013 at 15:35:55 UTC, Dmitry Olshansky wrote: Would be interesting to try gdc, as dmd on linux uses gcc. GDC decided to randomly not fail to build on my system, so it's time for MOAR CHARTS. dmd -inline -noboundscheck -O -w -wi -m64 -property ldc2 -O2 -release -vectorize -m64 gdc -O3 -fno-bounds-check -frelease -m64 http://hackerpilot.github.com/experimental/std_lexer/images/times3-all.png I still don't get what you're comparing here. You're benchmarking something like a mix of code generation, startup times, and filesystem behavior + lexing time (+ lexing output style). And this all only makes sense if your lexer produces the very same output as the dmd lexer (so it can be used out of the box as a replacement) - for being comparable - or aren't you in the detailed benchmarking/profiling phase yet?
Re: Request for comments: std.d.lexer
On Wednesday, 6 February 2013 at 03:51:33 UTC, Andrei Alexandrescu wrote: I think it would be reasonable for a lexer to require a range of ubyte as input, and carry its own decoding. In the first approximation it may even require a random-access range of ubyte. Playing around with that, I discovered a bug in std.utf: slices and other ranges are not treated the same way by decodeFront, which is rather problematic. Jonathan also hit that bug: http://d.puremagic.com/issues/show_bug.cgi?id=9456 That makes the lexer hard to write with a consistent behavior for InputRanges. The bug probably exists in everything that relies on decodeFront at some point.
Re: Request for comments: std.d.lexer
On Friday, 8 February 2013 at 20:54:32 UTC, Jonathan M Davis wrote: On Friday, February 08, 2013 23:50:13 Dmitry Olshansky wrote: Complication - yes; a slight performance cost is what I doubt in the RA case. Seems like a compiler/optimizer issue. The main problem isn't RA ranges. It's quite probable that the optimizer takes care of any minor additional overhead that's incurred there. The issue is really pure forward ranges, because you're forced to count the code units as they're processed (or even save the range as you go along, given how expensive it would be to go find the index even if you have it). But we really have no way of knowing how bad it is without hard data. It would certainly be trivial to end up doing other stuff in the implementation which cost far more. For pure InputRanges, that is pretty bad, as the decoding of UTF chars basically has to be done 2 times and each code point popped individually, or the lexer has to embed its own homebrew version of std.utf.
Re: Request for comments: std.d.lexer
On Saturday, 9 February 2013 at 07:15:57 UTC, deadalnix wrote: On Friday, 8 February 2013 at 20:54:32 UTC, Jonathan M Davis wrote: On Friday, February 08, 2013 23:50:13 Dmitry Olshansky wrote: Complication - yes; a slight performance cost is what I doubt in the RA case. Seems like a compiler/optimizer issue. The main problem isn't RA ranges. It's quite probable that the optimizer takes care of any minor additional overhead that's incurred there. The issue is really pure forward ranges, because you're forced to count the code units as they're processed (or even save the range as you go along, given how expensive it would be to go find the index even if you have it). But we really have no way of knowing how bad it is without hard data. It would certainly be trivial to end up doing other stuff in the implementation which cost far more. For pure InputRanges, that is pretty bad, as the decoding of UTF chars basically has to be done 2 times and each code point popped individually, or the lexer has to embed its own homebrew version of std.utf. Wow, that is complete bullshit xD I completely messed up what I wanted to say. So: std.utf.decodeFront may or may not pop the UTF character. And in the case where it does not, you end up having to do extra hanky-panky (and duplicate logic from std.utf to know whether it pops or not, which is likely to be very bug prone).
Re: Request for comments: std.d.lexer
On 2/8/2013 12:01 AM, Jonathan M Davis wrote: It's not quite a use case where ranges shine - especially when efficiency is a top priority. A more problematic case is dmd's lexer relies on a 0 byte at the end to be a sentinel for the end of file. Without such a sentinel, every consumption of a source character requires two operations rather than one.
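Walter's sentinel point can be sketched in a few lines of D (a hypothetical illustration, not dmd's actual code): with a '\0' appended to the buffer, the scanning loop needs only the character test, never a separate length check.

```d
// Assumes the source buffer has one extra '\0' byte appended,
// as dmd's lexer arranges when it loads a file.
size_t skipWhitespace(const(char)* p)
{
    const(char)* start = p;
    // '\0' is not whitespace, so the sentinel stops the loop:
    // one test per character instead of (bounds check + character test)
    while (*p == ' ' || *p == '\t')
        ++p;
    return p - start;
}

void main()
{
    // D string literals are '\0'-terminated, so .ptr is usable here
    assert(skipWhitespace("   x".ptr) == 3);
}
```

A range-based lexer cannot assume such a terminator exists in the underlying storage, which is exactly the extra per-character cost being debated.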
Re: Request for comments: std.d.lexer
On 2013-02-05 04:22, Andrei Alexandrescu wrote: Suggestion: take lexer.c and convert it to D. Should take one day, and you'll have performance on par. There's a reason why nobody has just extracted the lexer from DMD. It will probably take more than a day just to extract the lexer to be able to use it without the rest of DMD. It's probably easier to do the actual porting than to extract the lexer. Actually, when I think about it, Jonathan is working on porting the DMD lexer to D. -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 02/05/2013 07:19 AM, Brian Schott wrote: More optimizing: http://hackerpilot.github.com/experimental/std_lexer/images/times2.png Still only half speed. I'm becoming more and more convinced that Walter is actually a wizard. Time to do some hacking on your lexer, I guess. I'll try to add a couple of tricks and see if it helps. What command do you use for benchmarking?
Re: Request for comments: std.d.lexer
On Tuesday, February 05, 2013 09:14:35 Jacob Carlborg wrote: Actually, when I think about it, Jonathan is working on porting the DMD lexer to D. Not exactly. I decided that it would be of greater benefit to just write the thing according to the grammar and make sure that the compiler matched the spec. I've already found and fixed several bugs in the spec because of that. Also, given how I'm writing it, I expect that its speed will be similar to that of dmd given the fact that I'm writing it such that it will generally do the minimum number of operations required. But it wouldn't surprise me at all if no lexer can possibly match dmd at the moment as long as it's compiled with dmd (at least on Linux), because gcc's optimizer is so much better than dmd's, and dmd is getting compiled with gcc, whereas the competing lexer in D is being compiled with dmd. I don't have a lot of time to work on my lexer at the moment, but I'd really like to get it done soon, and I have most of the features in place. Unfortunately, when I went to try and work on it again the other day, the code wasn't compiling anymore, and I need to figure out why. I suspect that it's a regression related to string mixins, but I have to investigate further to sort it out. - Jonathan M Davis
Re: Request for comments: std.d.lexer
On 2013-02-05 10:07, Jonathan M Davis wrote: Not exactly. I decided that it would be of greater benefit to just write the thing according to the grammar and make sure that the compiler matched the spec. Aha, I see. -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On Tuesday, February 05, 2013 09:14:35 Jacob Carlborg wrote: On 2013-02-05 04:22, Andrei Alexandrescu wrote: Suggestion: take lexer.c and convert it to D. Should take one day, and you'll have performance on par. There's reason for why nobody has just extract the lexer from DMD. It will probably take more than a day just to extract the lexer to be able to use it without the rest of DMD. There are basic ideas about how it works which are obviously good and should be in the finished product in D, but it's not range-based, which forces you to do things differently. It's also not configurable, which forces you to do things differently. If it could be ported as-is and then compared for speed, then that would be a great test, since it would be able to show how much of the speed problem is purely a compiler issue as opposed to a design issue, but you wouldn't be able to actually use it for anything more than what Brian is doing with his performance testing, because as you point out, it's too integrated into dmd. It _would_ be valuable though as a performance test of the compiler. - Jonathan M Davis
Re: Request for comments: std.d.lexer
On Tuesday, February 05, 2013 01:07:48 Jonathan M Davis wrote: I don't have a lot of time to work on my lexer at the moment, but I'd really like to get it done soon, and I have most of the features in place. Unfortunately, when I went to try and work on it again the other day, the code wasn't compiling anymore, and I need to figure out why. I suspect that it's a regression related to string mixins, but I have to investigate further to sort it out. It turns out that it has nothing to do with string mixins (though it does have to do with CTFE): http://d.puremagic.com/issues/show_bug.cgi?id=9452 Fortunately, there's a simple workaround that'll let me continue until the bug is fixed. - Jonathan M Davis
Re: Request for comments: std.d.lexer
On 2013-02-05 11:44, Jonathan M Davis wrote: If it could be ported as-is and then compared for speed, then that would be a great test, since it would be able to show how much of the speed problem is purely a compiler issue as opposed to a design issue, but you wouldn't be able to actually use it for anything more than what Brian is doing with his performance testing, because as you point out, it's too integrated into dmd. It _would_ be valuable though as a performance test of the compiler. Yeah, that could be useful. -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 2013-02-05 11:46, Jonathan M Davis wrote: It turns out that it has nothing to do with string mixins (though it does have to do with CTFE): http://d.puremagic.com/issues/show_bug.cgi?id=9452 Fortunately, there's a simple workaround that'll let me continue until the bug is fixed. Ok, good that it's not a blocker. -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 2/5/13 12:45 AM, deadalnix wrote: On Tuesday, 5 February 2013 at 03:22:52 UTC, Andrei Alexandrescu wrote: On 2/4/13 10:19 PM, Brian Schott wrote: More optimizing: http://hackerpilot.github.com/experimental/std_lexer/images/times2.png Still only half speed. I'm becoming more and more convinced that Walter is actually a wizard. Suggestion: take lexer.c and convert it to D. Should take one day, and you'll have performance on par. DMD's lexer is not suitable for Phobos IMO. It doesn't take a range as input and doesn't produce a range. It also lacks the features you may want from a multi-usage D lexer. I looked at the code. Once converted, it's trivial to convert the lexer into an input range. Andrei
Re: Request for comments: std.d.lexer
On 2/5/13 5:44 AM, Jonathan M Davis wrote: On Tuesday, February 05, 2013 09:14:35 Jacob Carlborg wrote: On 2013-02-05 04:22, Andrei Alexandrescu wrote: Suggestion: take lexer.c and convert it to D. Should take one day, and you'll have performance on par. There's reason for why nobody has just extract the lexer from DMD. It will probably take more than a day just to extract the lexer to be able to use it without the rest of DMD. There are basic ideas about how it works which are obviously good and should be in the finished product in D, but it's not range-based, which forces you to do things differently. It's also not configurable, which forces you to do things differently. If it could be ported as-is and then compared for speed, then that would be a great test, since it would be able to show how much of the speed problem is purely a compiler issue as opposed to a design issue, but you wouldn't be able to actually use it for anything more than what Brian is doing with his performance testing, because as you point out, it's too integrated into dmd. It _would_ be valuable though as a performance test of the compiler. As far as I could tell the dependencies of the lexer are fairly contained (util, token, identifier) and conversion to input range is immediate. Andrei
Re: Request for comments: std.d.lexer
On 2013-02-05 14:34, Andrei Alexandrescu wrote: As far as I could tell the dependencies of the lexer are fairly contained (util, token, identifier) and conversion to input range is immediate. That's not what I remember from last time I gave it a try. -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 2/5/13, Brian Schott briancsch...@gmail.com wrote: I gave up on getting it to compile. It seems master is broken. An earlier commit is buildable with a small change, but it doesn't seem to parse modules like std.datetime. I don't know what exact version the front-end was ported from, but I've tried to parse datetime from 2.055 to 2.061 and it didn't work. So yeah it's not usable in this state.
Re: Request for comments: std.d.lexer
On Tuesday, 5 February 2013 at 08:52:53 UTC, Dmitry Olshansky wrote: Time to do some hacking on your lexer, I guess. I'll try to add a couple of tricks and see if it helps. What command do you use for benchmarking? I've been using avgtime[1] for measuring run times, perf[2], and callgrind/kcachegrind[3] for profiling. [1] https://github.com/jmcabo/avgtime [2] https://perf.wiki.kernel.org/index.php/Tutorial [3] http://kcachegrind.sourceforge.net/html/Home.html
Re: Request for comments: std.d.lexer
05-Feb-2013 22:25, Brian Schott wrote: On Tuesday, 5 February 2013 at 08:52:53 UTC, Dmitry Olshansky wrote: Time to do some hacking on your lexer, I guess. I'll try to add a couple of tricks and see if it helps. What command do you use for benchmarking? I've been using avgtime[1] for measuring run times, perf[2], and callgrind/kcachegrind[3] for profiling. [1] https://github.com/jmcabo/avgtime [2] https://perf.wiki.kernel.org/index.php/Tutorial [3] http://kcachegrind.sourceforge.net/html/Home.html Thanks. I've made a pass through the code and found some places to improve. Sadly I've been rather busy at work today. Anyway, get ready for pull requests should my ideas prove to be worthy performance-wise. -- Dmitry Olshansky
Re: Request for comments: std.d.lexer
On Tuesday, February 05, 2013 08:34:29 Andrei Alexandrescu wrote: As far as I could tell the dependencies of the lexer are fairly contained (util, token, identifier) and conversion to input range is immediate. I don't remember all of the details at the moment, since it's been several months since I looked at dmd's lexer, but a lot of the problem stems from the fact that it's all written around the assumption that it's dealing with a char*. Converting it to operate on string might be fairly straightforward, but it gets more complicated when dealing with ranges. Also, both Walter and others have stated that the lexer in D should be configurable in a number of ways, and dmd's lexer isn't configurable at all. So, while a direct translation would likely be quick, refactoring it to do what it's been asked to be able to do would not be. I'm quite a ways along with one that's written from scratch, but I need to find the time to finish it. Also, doing it from scratch has had the added benefit of helping me find bugs in the spec and in dmd. - Jonathan M Davis
Re: Request for comments: std.d.lexer
On 2/5/13 10:29 PM, Jonathan M Davis wrote: On Tuesday, February 05, 2013 08:34:29 Andrei Alexandrescu wrote: As far as I could tell the dependencies of the lexer are fairly contained (util, token, identifier) and conversion to input range is immediate. I don't remember all of the details at the moment, since it's been several months since I looked at dmd's lexer, but a lot of the problem stems from the fact that it's all written around the assumption that it's dealing with a char*. Converting it to operate on string might be fairly straightforward, but it gets more complicated when dealing with ranges. Also, both Walter and others have stated that the lexer in D should be configurable in a number of ways, and dmd's lexer isn't configurable at all. So, while a direct translation would likely be quick, refactoring it to do what it's been asked to be able to do would not be. I'm quite a ways along with one that's written from scratch, but I need to find the time to finish it. Also, doing it from scratch has had the added benefit of helping me find bugs in the spec and in dmd. I think it would be reasonable for a lexer to require a range of ubyte as input, and carry its own decoding. In the first approximation it may even require a random-access range of ubyte. Andrei
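The interface Andrei suggests could look roughly like the sketch below. All names (ByteLexer, frontByte, popByte) are hypothetical illustrations, not any actual std.d.lexer API: the lexer is parameterized over a random-access range of ubyte and owns the decoding, so it can take an ASCII fast path and only decode UTF-8 for the rare multi-byte cases.

```d
import std.range;

struct ByteLexer(R)
    if (isRandomAccessRange!R && hasLength!R && is(ElementType!R : ubyte))
{
    R input;
    size_t offset; // base + offset addressing, as discussed up-thread

    bool empty() const { return offset >= input.length; }

    // raw code unit access: no autodecoding ever happens behind our back
    ubyte frontByte() { return input[offset]; }
    void popByte() { ++offset; }
}

void main()
{
    auto src = cast(immutable(ubyte)[]) "int x;";
    auto lexer = ByteLexer!(immutable(ubyte)[])(src);
    assert(lexer.frontByte == 'i');
    lexer.popByte();
    assert(lexer.frontByte == 'n');
}
```

The point of requiring ubyte rather than char is precisely that ranges of ubyte bypass Phobos string autodecoding, leaving the lexer in full control of when (and whether) decoding occurs.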
Re: Request for comments: std.d.lexer
On Tuesday, February 05, 2013 22:51:32 Andrei Alexandrescu wrote: I think it would be reasonable for a lexer to require a range of ubyte as input, and carry its own decoding. In the first approximation it may even require a random-access range of ubyte. I'd have to think about how you'd handle the Unicode stuff in that case, since I'm not quite sure what you mean by having it handle its own decoding if it's a range of code units, but what I've been working on works with all of the character types and is very careful about how it deals with decoding in order to avoid unnecessary decoding. And that wasn't all that hard as far as the lexer's code goes. The hard part with that was making std.utf work with ranges of code units rather than just strings, and that was committed months ago. - Jonathan M Davis
Re: Request for comments: std.d.lexer
On 2013-02-04 01:50, FG wrote: Ah, fine. Then it's not that bad. :) Have you already made it use slices of the whole source, without copying the string values of tokens? What kind of optimizations have you made? That would be interesting to hear. Three times slower than DMD doesn't sound good. I know that DMD is fast, but three times? -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On Monday, 4 February 2013 at 00:22:42 UTC, Brian Schott wrote: After several hours of optimizing I've managed to make it so that dmd's lexer is only three times faster. http://hackerpilot.github.com/experimental/std_lexer/images/times.png The funny thing is that compiling with LDC gave a bigger speed boost than any of my code refactoring. Where is the current bottleneck? (Should be easy to find just by running the program ~5 times and suddenly breaking into it with a debugger.) Also, I'm assuming you've already tried disabling range-checking on arrays?
Re: Request for comments: std.d.lexer
On 2013-02-04 08:57, Jacob Carlborg wrote: On 2013-02-04 01:50, FG wrote: Ah, fine. Then it's not that bad. :) Have you already made it use slices of the whole source, without copying the string values of tokens? What kind of optimizations have you made? That would be interesting to hear. Three times slower than DMD doesn't sound good. I know that DMD is fast, but three times. Looking at the current source, there is now a StringCache to hold the strings. It is however also used to store all those long comments, so I believe quite some time is unnecessarily wasted on generating hashes for them. But probably there are other parts slowing things down.
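The zero-copy slicing FG asks about can be sketched like this (a minimal illustration with invented names, not the actual std.d.lexer code): each token's string value is a slice of the original source, so lexing allocates nothing for token text.

```d
struct Token
{
    string value;  // slice into the source - no allocation, no copy
    size_t offset; // position in the source
}

Token lexIdentifier(string source, size_t start)
{
    size_t i = start;
    while (i < source.length
           && (source[i] == '_'
               || ('a' <= source[i] && source[i] <= 'z')
               || ('A' <= source[i] && source[i] <= 'Z')))
        ++i;
    // D slicing shares memory with `source`; nothing is copied
    return Token(source[start .. i], start);
}

void main()
{
    auto src = "foo = 42;";
    auto tok = lexIdentifier(src, 0);
    assert(tok.value == "foo");
    assert(tok.value.ptr == src.ptr); // same memory: proof of zero-copy
}
```

The trade-off is that every token pins the whole source buffer alive, which is why a string cache (interning) can still pay off for identifiers that must outlive the source.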
Re: Request for comments: std.d.lexer
More optimizing: http://hackerpilot.github.com/experimental/std_lexer/images/times2.png Still only half speed. I'm becoming more and more convinced that Walter is actually a wizard.
Re: Request for comments: std.d.lexer
On 2/4/13 10:19 PM, Brian Schott wrote: More optimizing: http://hackerpilot.github.com/experimental/std_lexer/images/times2.png Still only half speed. I'm becoming more and more convinced that Walter is actually a wizard. Suggestion: take lexer.c and convert it to D. Should take one day, and you'll have performance on par. Andrei
Re: Request for comments: std.d.lexer
On 2/5/13, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: Suggestion: take lexer.c and convert it to D. Should take one day, and you'll have performance on par. This was already done for DDMD, and the more recent minimal version of it: https://github.com/zachthemystic/ddmd-clean/blob/master/dmd/lexer.d
Re: Request for comments: std.d.lexer
On 2/4/13 11:05 PM, Andrej Mitrovic wrote: On 2/5/13, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: Suggestion: take lexer.c and convert it to D. Should take one day, and you'll have performance on par. This was already done for DDMD, and the more recent minimal version of it: https://github.com/zachthemystic/ddmd-clean/blob/master/dmd/lexer.d Awesome! Has anyone measured the speed? Andrei
Re: Request for comments: std.d.lexer
On Tuesday, 5 February 2013 at 04:19:54 UTC, Andrei Alexandrescu wrote: Awesome! Has anyone measured the speed? Andrei I gave up on getting it to compile. ddmd (the project this one is based on) was last updated to 2.040 according to its page on dsource, and I also haven't gotten it to compile.
Re: Request for comments: std.d.lexer
On Tuesday, 5 February 2013 at 03:22:52 UTC, Andrei Alexandrescu wrote: On 2/4/13 10:19 PM, Brian Schott wrote: More optimizing: http://hackerpilot.github.com/experimental/std_lexer/images/times2.png Still only half speed. I'm becoming more and more convinced that Walter is actually a wizard. Suggestion: take lexer.c and convert it to D. Should take one day, and you'll have performance on par. DMD's lexer is not suitable for Phobos IMO. It doesn't take a range as input and doesn't produce a range. It also lacks the features you may want from a multi-usage D lexer.
Re: Request for comments: std.d.lexer
03-Feb-2013 05:30, Walter Bright wrote: On 2/2/2013 1:54 PM, Dmitry Olshansky wrote: I think I failed to describe the virtual memory trick or you just ignored it? You can't just reserve 1Gb of address space in a 32 bit system. (I know about reserving address space vs committing memory.) OK, sorry, you must know this stuff thoroughly. 1Gb is not the point; I feel I shouldn't have listed 1Gb at all, as we need much less. It's about 21 Mb of D source code in the full dmd test suite + druntime + phobos, and only 4M in vibe.d including all examples. Reserve this amount. It's the same amount of memory as in the read-them-all-and-slice-them-all approach that is said to be so fast and practical. And yet we don't commit this RAM at all most of the time. Thinking more of 32bit systems that are so tight on address space: it's 2Gb for user mode. Precisely because of that, a signed 32 bit offset is enough to address RAM if we store it in multiple chunks like you said. The fact that there is only 2Gb of user space actually sort of helps us, as we surely can reach the other chunks from a single base. In the case of 3Gb (some Linuxes and/or large-address-aware win32) we just need to reserve the first range somewhere in the middle; as we do it at startup, this should be easy with a few probes. Also, after the lexing you can always free the address space and reallocate-compact it as the last step, like I said earlier. -- Dmitry Olshansky
Re: Request for comments: std.d.lexer
After several hours of optimizing I've managed to make it so that dmd's lexer is only three times faster. http://hackerpilot.github.com/experimental/std_lexer/images/times.png The funny thing is that compiling with LDC gave a bigger speed boost than any of my code refactoring.
Re: Request for comments: std.d.lexer
On 2013-02-04 01:22, Brian Schott wrote: After several hours of optimizing I've managed to make it so that dmd's lexer is only three times faster. What are you comparing here? How do you time DMD's lexing stage?
Re: Request for comments: std.d.lexer
On Monday, 4 February 2013 at 00:38:57 UTC, FG wrote: On 2013-02-04 01:22, Brian Schott wrote: After several hours of optimizing I've managed to make it so that dmd's lexer is only three times faster. What are you comparing here? How do you time DMD's lexing stage? A simple hack of module.c that prints out the file's token count and then calls exit(0).
Re: Request for comments: std.d.lexer
On 2013-02-04 01:41, Brian Schott wrote: What are you comparing here? How do you time DMD's lexing stage? A simple hack of module.c that prints out the file's token count and then calls exit(0). Ah, fine. Then it's not that bad. :) Have you already made it use slices of the whole source, without copying the string values of tokens? What kind of optimizations have you made?
Re: Request for comments: std.d.lexer
02-Feb-2013 01:10, Walter Bright wrote: On 2/1/2013 3:22 AM, Dmitry Olshansky wrote: 01-Feb-2013 15:05, Walter Bright wrote: On 1/30/2013 8:44 AM, Dmitry Olshansky wrote: In the allocation scheme I proposed, the ID could be a 32bit offset into the unique identifiers chunk. That only works if you know in advance the max size the chunk can ever be and preallocate it. Otherwise, you have no guarantee that the next allocated chunk will be within 32 bits of address of the previous chunks. Well, I supposed it's exactly one reallocatable block. Then a token has an offset that doesn't care if the block was reallocated. Or rather, the reallocation just RESERVEs virtual RAM for it (say 1G) and COMMITs it page by page when you need to grow it. Once lexing is done, shrink the virtual region to the actual used size to free up address space (e.g. if we are on 32 bits). As for the 32bit limit: a 4Gb maximum for the cumulative length of all unique identifier names is more than enough by any standard. I haven't seen a 4G codebase, not to speak of identifiers alone, even if we count all the repetitions separately. Your technique can work, provided the number of identifiers isn't large enough that memory fragmentation will prevent being able to reallocate the buffer to a larger size. I think I failed to describe the virtual memory trick or you just ignored it? But regardless, I have to correct myself on both the amount to reserve and the fact that it's actually not virtual memory alone but a virtue of the MMU. See below a full program in D showing the trick in action, currently Win32 only, as I can't recall the right POSIX calls offhand. It's a benchmark against the built-in array/appender - the heap stuff. The speed is similar, with my quick and dirty stuff being much faster iff arrays don't allocate all of the RAM beforehand. The steps to snatch the power of not having any reallocations: 1. Reserve a contiguous *address space* range.
Not even a bit of virtual memory is reserved/committed/touched at this moment; the kernel merely records the reservation, AFAICT. (For the lexer, this address space range would have a length equal to the sum of all file lengths, as Tove points out; this is the first correction.) 2. Any time there is no memory to put more stuff into (as is the case initially), commit the next few pages. At this moment virtual RAM is committed but not yet allocated. 3. [this I think we all know] The moment a memory location in the committed part of the range is touched, the page for it is allocated (and on Win32 it initially contains zeros automatically). Only at this point is virtual memory allocated. So you see where my 2nd correction goes. In essence you get 0 reallocations and no more than 1 page of virtual memory wasted in all cases. Sweet deal ;) Then there is an optional step if you think that too much address space is used (makes sense only on 32 bits, if at all): copy the resulting block of actual data elsewhere in the heap and release the whole address range.

The code (the same as this github gist: https://gist.github.com/4699388 ):

///Written in the D programming language

// Bench built-in array append, std.array.Appender
// and a custom virtual-memory-aware VirtualArray in terms of
// memory usage and the time taken to append up to X megabytes
// using different chunk sizes:
// 16 bytes, 64 bytes, 256 bytes and 16Kb at a time.
// pass this version to insert readln's
// at interesting points and investigate RAM usage
// make sure to enable relevant columns in task manager process details:
// committed size and private working set
// version = diagnose;

import std.c.windows.windows;
import std.exception, std.datetime, std.algorithm, std.range, std.array;

auto roundUpToPowerOf2(size_t var, size_t pow2)
{
    return (var + pow2 - 1) & ~(pow2 - 1);
}

struct VirtualArray
{
private:
    ubyte* pointer;   // to the beginning of the reserved memory
    size_t _length;   // length of data
    size_t committed; // committed memory so far
    size_t reserved;  // total possible maximum to grow to
    size_t pageSize;  // page size, this could be global

    // commit some more memory from the reserved range
    void extend(size_t size)
    {
        size = roundUpToPowerOf2(size, pageSize);
        // this will throw once we run out of reserved pages
        enforce(VirtualAlloc(pointer + committed, size,
                             MEM_COMMIT, PAGE_READWRITE));
        committed += size;
    }

public:
    this(size_t maxCapacity)
    {
        SYSTEM_INFO sysinfo;
        GetSystemInfo(&sysinfo);
        pageSize = sysinfo.dwPageSize; // the page size is a power of 2
        // round the capacity up to a multiple of page size
        maxCapacity = roundUpToPowerOf2(maxCapacity, pageSize);
        pointer = cast(ubyte*)enforce(VirtualAlloc(null, maxCapacity,
                                                   MEM_RESERVE, PAGE_READWRITE));
        committed = 0;
        reserved = maxCapacity;
        _length = 0;
    }

    // bare minimum
Re: Request for comments: std.d.lexer
On 2/2/2013 1:54 PM, Dmitry Olshansky wrote: I think I failed to describe the virtual memory trick or you just ignored it? You can't just reserve 1Gb of address space in a 32 bit system. (I know about reserving address space vs committing memory.)
Re: Request for comments: std.d.lexer
On Friday, 1 February 2013 at 07:41:11 UTC, qznc wrote: On Thursday, 31 January 2013 at 12:14:35 UTC, Jacob Carlborg wrote: Just thinking out loud here. Would it be possible to lex a file in parallel? Cutting it in half (or similar) and lex both pieces simultaneously in parallel. I think a lexer should be IO-bound on today's machines, so parallelizing should not give much benefit. Pretty sure RAM these days can service more than one processor's memory requests at a given point in time, so there should be an N-times speedup, where N maxes out at the maximum memory throughput...
Re: Request for comments: std.d.lexer
On 2013-02-01 08:41, qznc wrote: I think a lexer should be IO-bound on today's machines, so parallelizing should not give much benefit. I'm not sure I understand. -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 1/30/2013 8:44 AM, Dmitry Olshansky wrote: In the allocation scheme I proposed, the ID could be a 32-bit offset into the unique identifiers chunk. That only works if you know in advance the max size the chunk can ever be and preallocate it. Otherwise, you have no guarantee that the next allocated chunk will be within 32 bits of address of the previous chunks.
Re: Request for comments: std.d.lexer
01-Feb-2013 15:05, Walter Bright wrote: On 1/30/2013 8:44 AM, Dmitry Olshansky wrote: In the allocation scheme I proposed, the ID could be a 32-bit offset into the unique identifiers chunk. That only works if you know in advance the max size the chunk can ever be and preallocate it. Otherwise, you have no guarantee that the next allocated chunk will be within 32 bits of address of the previous chunks. Well, I supposed it's exactly one reallocatable block. Then tokens have an offset that doesn't care if the block was reallocated. Or rather, the reallocation just RESERVEs virtual RAM for it (say 1G) and COMMITs it page by page as you need to grow it. Once lexing is done, shrink the virtual region to the actual used size to free up address space (e.g. if we are on 32 bits). As for the 32-bit limit: that gives a 4GB maximum for the cumulative length of all unique identifier names, which is more than enough by any standard. I haven't seen a 4GB codebase, to say nothing of 4GB of identifiers alone, even counting all the repetitions separately. -- Dmitry Olshansky
Re: Request for comments: std.d.lexer
On Friday, 1 February 2013 at 11:06:02 UTC, Walter Bright wrote: On 1/30/2013 8:44 AM, Dmitry Olshansky wrote: In the allocation scheme I proposed, the ID could be a 32-bit offset into the unique identifiers chunk. That only works if you know in advance the max size the chunk can ever be and preallocate it. Otherwise, you have no guarantee that the next allocated chunk will be within 32 bits of address of the previous chunks. This can easily be achieved by preallocating file.size bytes... it will be orders of magnitude too much, but it doesn't matter, as in the end only the cache locality matters.
Re: Request for comments: std.d.lexer
On 2/1/2013 3:22 AM, Dmitry Olshansky wrote: 01-Feb-2013 15:05, Walter Bright wrote: On 1/30/2013 8:44 AM, Dmitry Olshansky wrote: In the allocation scheme I proposed, the ID could be a 32-bit offset into the unique identifiers chunk. That only works if you know in advance the max size the chunk can ever be and preallocate it. Otherwise, you have no guarantee that the next allocated chunk will be within 32 bits of address of the previous chunks. Well, I supposed it's exactly one reallocatable block. Then tokens have an offset that doesn't care if the block was reallocated. Or rather, the reallocation just RESERVEs virtual RAM for it (say 1G) and COMMITs it page by page as you need to grow it. Once lexing is done, shrink the virtual region to the actual used size to free up address space (e.g. if we are on 32 bits). As for the 32-bit limit: that gives a 4GB maximum for the cumulative length of all unique identifier names, which is more than enough by any standard. I haven't seen a 4GB codebase, to say nothing of 4GB of identifiers alone, even counting all the repetitions separately. Your technique can work, provided the number of identifiers isn't large enough that memory fragmentation will prevent being able to reallocate the buffer to a larger size.
Re: Request for comments: std.d.lexer
On 2013-01-27 10:51, Brian Schott wrote: I'm writing a D lexer for possible inclusion in Phobos. DDOC: http://hackerpilot.github.com/experimental/std_lexer/phobos/lexer.html Code: https://github.com/Hackerpilot/Dscanner/blob/range-based-lexer/std/d/lexer.d It's currently able to correctly syntax highlight all of Phobos, but does a fairly bad job at rejecting or notifying users/callers about invalid input. I'd like to hear arguments on the various ways to handle errors in the lexer. In a compiler it would be useful to throw an exception on finding something like a string literal that doesn't stop before EOF, but a text editor or IDE would probably want to be a bit more lenient. Maybe having it run-time (or compile-time) configurable, like std.csv, would be the best option here. I'm interested in ideas on the API design and other high-level issues at the moment. I don't consider this ready for inclusion. (The current module being reviewed for inclusion in Phobos is the new std.uni.) Just thinking out loud here. Would it be possible to lex a file in parallel? Cutting it in half (or similar) and lex both pieces simultaneously in parallel. -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 2013-01-30 10:49, Brian Schott wrote:

Results:

$ avgtime -q -r 200 ./dscanner --tokenCount ../phobos/std/datetime.d

Total time (ms): 13861.8
Repetitions    : 200
Sample mode    : 69 (90 ocurrences)
Median time    : 69.0745
Avg time       : 69.3088
Std dev.       : 0.670203
Minimum        : 68.613
Maximum        : 72.635
95% conf.int.  : [67.9952, 70.6223]  e = 1.31357
99% conf.int.  : [67.5824, 71.0351]  e = 1.72633
EstimatedAvg95%: [69.2159, 69.4016]  e = 0.0928836
EstimatedAvg99%: [69.1867, 69.4308]  e = 0.12207

If my math is right, that means it's getting 4.9 million tokens/second now. According to Valgrind the only way to really improve things now is to require that the input to the lexer support slicing. (Remember the secret of Tango's XML parser...) The bottleneck is now on the calls to .idup to construct the token strings from slices of the buffer.

How many tokens would that be in total? -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 2013-01-31 13:14, Jacob Carlborg wrote: Just thinking out loud here. Would it be possible to lex a file in parallel? Cutting it in half (or similar) and lex both pieces simultaneously in parallel. Do you know where you can safely cut it without having it lexed beforehand? :)
Re: Request for comments: std.d.lexer
On 31.01.2013 13:14, Jacob Carlborg wrote: On 2013-01-27 10:51, Brian Schott wrote: I'm writing a D lexer for possible inclusion in Phobos. DDOC: http://hackerpilot.github.com/experimental/std_lexer/phobos/lexer.html Code: https://github.com/Hackerpilot/Dscanner/blob/range-based-lexer/std/d/lexer.d It's currently able to correctly syntax highlight all of Phobos, but does a fairly bad job at rejecting or notifying users/callers about invalid input. I'd like to hear arguments on the various ways to handle errors in the lexer. In a compiler it would be useful to throw an exception on finding something like a string literal that doesn't stop before EOF, but a text editor or IDE would probably want to be a bit more lenient. Maybe having it run-time (or compile-time) configurable, like std.csv, would be the best option here. I'm interested in ideas on the API design and other high-level issues at the moment. I don't consider this ready for inclusion. (The current module being reviewed for inclusion in Phobos is the new std.uni.) Just thinking out loud here. Would it be possible to lex a file in parallel? Cutting it in half (or similar) and lex both pieces simultaneously in parallel. Why not? Only the symbols at the border need to be connected. Then the question is: how many blocks (threads) are OK for 1, 2, 3, 8 cores? It can also depend on the speed of the filesystem.
Re: Request for comments: std.d.lexer
On 2013-01-31 13:35, dennis luehring wrote: why not only the symbols at the border needs to be connected then the question is: how many blocks(threads) are ok for 1,2,3,8 cores - can also depend on the speed of the filesystem That would require some profiling to figure out. -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 2013-01-31 13:34, FG wrote: Do you know where you can safely cut it without having it lexed beforehand? :) I was thinking about that myself. It would probably be possible to just cut it in the middle and then lex a few characters forward and backward until you get a valid token. Or try to calculate the correct index at which to cut. Although I have no idea how much trouble it would give you and how much you would gain. -- /Jacob Carlborg
Re: Request for comments: std.d.lexer
On 31.01.2013 13:48, Jacob Carlborg wrote: On 2013-01-31 13:35, dennis luehring wrote: why not? Only the symbols at the border need to be connected. Then the question is: how many blocks (threads) are OK for 1, 2, 3, 8 cores? It can also depend on the speed of the filesystem. That would require some profiling to figure out. I would say it can help a lot, so the design should be able to use split parts and combine the symbols at the border. File-based lexing should also be threadable, so that 16 files can be handled by my 16-core system fully in parallel :) But it's also size-dependent - it makes no sense to split into parts that are too small; that could be counter-productive.
Re: Request for comments: std.d.lexer
On Thu, Jan 31, 2013 at 01:48:02PM +0100, Jacob Carlborg wrote: On 2013-01-31 13:34, FG wrote: Do you know where you can safely cut it without having it lexed beforehand? :) I was thinking that myself. It would probably be possible to just cut it in the middle and then lex a few characters forward and backwards until you get a valid token. Try and calculate the correct index where to cut. [...] Doesn't work if the middle happens to be inside a string literal containing code. Esp. a q{} literal (you wouldn't be able to tell where it starts/ends without scanning the entire file, because the {}'s nest). T -- Life is unfair. Ask too much from it, and it may decide you don't deserve what you have now either.