Re: dmd 1.060 and 2.045 release
Walter Bright wrote:
> Walter Bright wrote:
>> I recompiled dmd with the profiler (-gt switch) which confirmed it.
> For those interested, try out changeset 470.
On my timing tests, the time spent is linear with the number of characters in the identifier. It's still too slow, though.
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
On 05/07/2010 01:36 AM, Robert Clipsham wrote:
> I love it! I don't suppose you have a guide for how to get it set up and working in vim do you? I've never managed to get ctags working, even with C/C++ :/
The only tag-related configuration in my Vim setup is:
    set tags=./tags;
in ~/.vimrc. (Note the semicolon: it makes Vim look for a tags file upward, toward the root directory.)
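For readers unfamiliar with the upward search mentioned above, here is a minimal Python sketch of the idea: start in one directory and walk toward the filesystem root until a file named `tags` turns up. The function name `find_tags_upward` is made up for illustration, and the sketch only returns the nearest match; Vim's own behavior may differ in details.

```python
from pathlib import Path
from typing import Optional


def find_tags_upward(start_dir: str, name: str = "tags") -> Optional[Path]:
    """Return the nearest file called `name` in start_dir or any ancestor."""
    current = Path(start_dir).resolve()
    # current.parents lists every ancestor up to the filesystem root.
    for directory in [current, *current.parents]:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None  # nothing found all the way up to the root
```

With this kind of lookup, a single tags file at the top of a project is enough for files in any subdirectory.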
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
"Ali Çehreli" wrote in message news:hs0332$cd...@digitalmars.com... > MIURA Masahiro wrote: > >> I'm considering an enhancement: "d2tags " reads >> all JSON files in the directory. However I'm not sure if it >> should recurse into subdirectories. > > I think simpler is better. There are already tools like find on all Linux > shells that could do the recursion. > I'd say "d2tags " is a lot simpler than figuring out Linux's find and combining it with d2tags.
Re: dmd 1.060 and 2.045 release
On 7-5-2010 9:10, Brad Roberts wrote:
> On Fri, 7 May 2010, Lionello Lunesu wrote:
>> On 6-5-2010 22:37, Michel Fortin wrote:
>>> On 2010-05-05 23:45:50 -0400, Walter Bright said:
>>>> Walter Bright wrote:
>>>>> Alex Makhotin wrote:
>>>>>> It takes ~40 seconds 50% load on the dual core processor(CentOS 5.3 kernel 2.6.32.4), to get the actual error messages about the undefined identifier.
>>>>> Definitely there's a problem.
>>>> The problem is the spell checker is O(n*n) on the number of characters in the undefined identifier.
>>> That's an algorithm that can't scale then.
>>> Checking the Levenshtein distance for each known identifier within a small difference in length would be a better idea. (Clang is said to use the Levenshtein distance, it probably does something of the sort.)
>>> http://en.wikipedia.org/wiki/Levenshtein_distance
>> and especially this line:
>> # If we are only interested in the distance if it is smaller than a threshold k, then it suffices to compute a diagonal stripe of width 2k+1 in the matrix. In this way, the algorithm can be run in O(kl) time, where l is the length of the shortest string.
> The source for this is pretty isolated.. anyone want to volunteer to take a shot at improving this part of dmd?
> Later,
> Brad
I see, speller.c, I'll have a look..
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
MIURA Masahiro wrote: I'm considering an enhancement: "d2tags " reads all JSON files in the directory. However I'm not sure if it should recurse into subdirectories. I think simpler is better. There are already tools like find on all Linux shells that could do the recursion. Ali
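To make the two options under discussion concrete, here is a small Python sketch (not part of d2tags, which is written in D) of collecting JSON files from a directory with and without recursion. The function name `collect_json_files` and the `recurse` flag are hypothetical.

```python
from pathlib import Path
from typing import List


def collect_json_files(directory: str, recurse: bool = False) -> List[Path]:
    """List the *.json files in a directory, optionally descending into subdirectories."""
    root = Path(directory)
    # glob looks at the directory itself only; rglob also walks every subdirectory.
    matches = root.rglob("*.json") if recurse else root.glob("*.json")
    return sorted(p for p in matches if p.is_file())
```

Leaving `recurse` off matches the "simpler is better" position; turning it on does inside the tool what an external `find` would otherwise do.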
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
On 05/07/2010 01:48 AM, Andrei Alexandrescu wrote: > I wonder if this is of enough general utility to warrant inclusion > within the D distribution, along with rdmd. Thoughts? That's my pleasure, actually! > One small suggestion, Masahiro: you may want to replace the file reading > loop in main() with simply std.file.readText(args[1]). Done, thank you for the advice. I'm considering an enhancement: "d2tags " reads all JSON files in the directory. However I'm not sure if it should recurse into subdirectories.
Re: dmd 1.060 and 2.045 release
Walter Bright wrote: I recompiled dmd with the profiler (-gt switch) which confirmed it. For those interested, try out changeset 470.
Re: dmd 1.060 and 2.045 release
On Fri, 7 May 2010, Lionello Lunesu wrote:
> On 6-5-2010 22:37, Michel Fortin wrote:
>> On 2010-05-05 23:45:50 -0400, Walter Bright said:
>>> Walter Bright wrote:
>>>> Alex Makhotin wrote:
>>>>> It takes ~40 seconds 50% load on the dual core processor(CentOS 5.3 kernel 2.6.32.4), to get the actual error messages about the undefined identifier.
>>>> Definitely there's a problem.
>>> The problem is the spell checker is O(n*n) on the number of characters in the undefined identifier.
>> That's an algorithm that can't scale then.
>> Checking the Levenshtein distance for each known identifier within a small difference in length would be a better idea. (Clang is said to use the Levenshtein distance, it probably does something of the sort.)
>> http://en.wikipedia.org/wiki/Levenshtein_distance
> and especially this line:
> # If we are only interested in the distance if it is smaller than a threshold k, then it suffices to compute a diagonal stripe of width 2k+1 in the matrix. In this way, the algorithm can be run in O(kl) time, where l is the length of the shortest string.
The source for this is pretty isolated.. anyone want to volunteer to take a shot at improving this part of dmd?
Later,
Brad
Re: dmd 1.060 and 2.045 release
On 6-5-2010 22:37, Michel Fortin wrote:
> On 2010-05-05 23:45:50 -0400, Walter Bright said:
>> Walter Bright wrote:
>>> Alex Makhotin wrote:
>>>> It takes ~40 seconds 50% load on the dual core processor(CentOS 5.3 kernel 2.6.32.4), to get the actual error messages about the undefined identifier.
>>> Definitely there's a problem.
>> The problem is the spell checker is O(n*n) on the number of characters in the undefined identifier.
> That's an algorithm that can't scale then.
> Checking the Levenshtein distance for each known identifier within a small difference in length would be a better idea. (Clang is said to use the Levenshtein distance, it probably does something of the sort.)
> http://en.wikipedia.org/wiki/Levenshtein_distance
and especially this line:
# If we are only interested in the distance if it is smaller than a threshold k, then it suffices to compute a diagonal stripe of width 2k+1 in the matrix. In this way, the algorithm can be run in O(kl) time, where l is the length of the shortest string.
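The diagonal-stripe idea quoted above can be sketched in Python as follows. This is an illustration of the general technique, not code from dmd's speller.c; the name `bounded_levenshtein` and the choice to return `None` when the distance exceeds the threshold are assumptions made for the example. Each row stores only the 2k+1 cells of the stripe, indexed by the offset `j - i + k`.

```python
from typing import Optional


def bounded_levenshtein(a: str, b: str, k: int) -> Optional[int]:
    """Return the edit distance between a and b if it is <= k, else None."""
    if k < 0:
        raise ValueError("k must be non-negative")
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None  # the length gap alone already exceeds the threshold
    big = k + 1            # stands in for "more than k"
    width = 2 * k + 1      # cells kept per row: the diagonal stripe
    # Row 0: reaching b[:j] from the empty prefix costs j insertions.
    prev = [big] * width
    for j in range(min(m, k) + 1):
        prev[j + k] = j
    for i in range(1, n + 1):
        cur = [big] * width
        for d in range(width):
            j = i + d - k  # column of b that this stripe cell represents
            if j < 0 or j > m:
                continue
            if j == 0:
                cur[d] = i  # i deletions; only reachable while i <= k
                continue
            cost = 0 if a[i - 1] == b[j - 1] else 1
            best = prev[d] + cost                   # match or substitution
            if d + 1 < width:
                best = min(best, prev[d + 1] + 1)   # deletion from a
            if d > 0:
                best = min(best, cur[d - 1] + 1)    # insertion into a
            cur[d] = min(best, big)
        prev = cur
    result = prev[m - n + k]
    return result if result <= k else None
```

For example, `bounded_levenshtein("kitten", "sitting", 3)` gives 3, while the same call with `k=2` gives `None`. The work per row is proportional to 2k+1 rather than to the length of the other string.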
Re: dmd 1.060 and 2.045 release
Steven Schveighoffer wrote:
> On Thu, 06 May 2010 17:07:12 -0400, Walter Bright wrote:
>> Steven Schveighoffer wrote:
>>> That can't be it. The identifier shown by Alex is only 33 characters. O(n^2) is not that slow, especially for smaller variables. There must be other factors you're not considering...
>> I recompiled dmd with the profiler (-gt switch) which confirmed it.
> So a single unknown symbol (from Alex's example) which can be checked against each existing symbol in O(n^2) time, takes 40 seconds on a decent CPU? How many other symbols are there? 33^2 == 1089, if there are 10000 symbols, that's 10 million iterations, that shouldn't take 40 seconds to run, should it? Are there more symbols to compare against?
> Do you use heuristics to prune the search? For example, if the max distance is 2, and the difference in length between two strings is >2, you should be able to return immediately.
Check out the profiler output. It's clearly the vast number of calls to the symbol lookup, not the time spent in each call.
------------------------------------------------------------
     Num        Tree       Func   Per
   Calls        Time       Time  Call
50409318   632285778  145858160     2  Dsymbol *syscall ScopeDsymbol::search(Loc ,Identifier *,int )
50411264   131394915  106356855     2  void **syscall StringTable::search(char const *,unsigned )
50409329   341960075  105532978     2  Dsymbol *syscall DsymbolTable::lookup(Identifier *)
50409329   236427096  105037393     2  StringValue *syscall StringTable::lookup(char const *,unsigned )
12602340   613890619   67393753     5  Dsymbol *syscall Scope::search(Loc ,Identifier *,Dsymbol **)
12602178   693915197   66918360     5  void *cdecl scope_search_fp(void *,char const *)
25204505   461352920   52529164     2  Dsymbol *syscall Module::search(Loc ,Identifier *,int )
50412137    25038474   25038474     0  unsigned cdecl Dchar::calcHash(char const *,unsigned )
    3520  1428323068   20349375  5781  void *cdecl spellerX(char const *,void *cdecl (*)(void *,char const *),void *,char const *,int )
12602664     6811916    6811916     0  syscall Identifier::Identifier(char const *,int )
12602178     6299089    6299089     0  void cdecl Module::clearCache()
12602183     6151175    6151175     0  Module *syscall Module::isModule()
    1600       11329       4261     2  StringValue *syscall StringTable::update(char const *,unsigned )
------------------------------------------------------------
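One way to read these call counts, offered as an assumption rather than a description of speller.c: if a spell checker generates every string within one edit of the unknown name and looks each one up, then repeats that for a second edit, the number of lookups grows roughly with the square of the name's length. The Python sketch below counts the one-edit variants; the 63-character alphabet and both function names are hypothetical.

```python
import string
from typing import Iterator

# Assumed identifier alphabet: letters, digits and underscore (63 characters).
ALPHABET = string.ascii_letters + string.digits + "_"


def one_edit_variants(word: str) -> Iterator[str]:
    """Yield every string one deletion, transposition, substitution or insertion away.

    Duplicates (and the word itself, via same-letter substitution) are not filtered.
    """
    for i in range(len(word)):
        yield word[:i] + word[i + 1:]                          # deletion
    for i in range(len(word) - 1):
        yield word[:i] + word[i + 1] + word[i] + word[i + 2:]  # transposition
    for i in range(len(word)):
        for c in ALPHABET:
            yield word[:i] + c + word[i + 1:]                  # substitution
    for i in range(len(word) + 1):
        for c in ALPHABET:
            yield word[:i] + c + word[i:]                      # insertion


def count_one_edit_variants(length: int) -> int:
    """Closed-form count of what one_edit_variants yields for a word of this length."""
    a = len(ALPHABET)
    return length + max(length - 1, 0) + length * a + (length + 1) * a
```

For a 33-character name this gives 4286 variants per pass, so two passes are on the order of 4286 * 4286, about 18 million lookups. That is the same order of magnitude as the roughly 12.6 million calls to scope_search_fp in the profile, although the real candidate set in dmd may be built differently.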
Re: dmd 1.060 and 2.045 release
On Thu, 06 May 2010 17:07:12 -0400, Walter Bright wrote:
> Steven Schveighoffer wrote:
>> That can't be it. The identifier shown by Alex is only 33 characters. O(n^2) is not that slow, especially for smaller variables. There must be other factors you're not considering...
> I recompiled dmd with the profiler (-gt switch) which confirmed it.
So a single unknown symbol (from Alex's example) which can be checked against each existing symbol in O(n^2) time, takes 40 seconds on a decent CPU? How many other symbols are there? 33^2 == 1089, if there are 10000 symbols, that's 10 million iterations, that shouldn't take 40 seconds to run, should it? Are there more symbols to compare against?
Do you use heuristics to prune the search? For example, if the max distance is 2, and the difference in length between two strings is >2, you should be able to return immediately.
-Steve
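The length-based pruning suggested here fits in a few lines of Python. The helper name `prune_by_length` is invented for the example, and the table size used in the arithmetic (10,000 known symbols) is a hypothetical figure chosen to match the "10 million iterations" estimate, not a number measured from dmd.

```python
from typing import Iterable, List


def prune_by_length(unknown: str, known: Iterable[str], max_distance: int = 2) -> List[str]:
    """Keep only names whose length is within max_distance of the unknown name.

    Every insertion or deletion changes the length by one, so a name whose
    length differs by more than max_distance cannot be within that distance.
    """
    return [name for name in known if abs(len(name) - len(unknown)) <= max_distance]


# Back-of-the-envelope cost of the unpruned O(n^2)-per-comparison approach:
# a 33-character name compared against a hypothetical 10,000 known symbols.
iterations = 33 ** 2 * 10_000  # 10,890,000, i.e. roughly 10 million
```

Only the names that survive the filter need a full distance computation, which costs a single length comparison per rejected name.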
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
On 07/05/10 06:30, Lutger wrote: Yes it's very useful. How about also including the source in the examples directory? That's a good idea, seeing as most of the examples are either for Windows, or outdated.
Re: dmd 1.060 and 2.045 release
Steven Schveighoffer wrote: That can't be it. The identifier shown by Alex is only 33 characters. O(n^2) is not that slow, especially for smaller variables. There must be other factors you're not considering... I recompiled dmd with the profiler (-gt switch) which confirmed it.
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
Andrei Alexandrescu wrote:
> MIURA Masahiro wrote:
>> Hi,
>> Being happy to see issue 3415 (broken JSON format) fixed, I have written a utility to convert DMD2's JSON output to Exuberant Ctags format. This enables you to tagjump in Vim and other editors/IDEs. It's just 150+ lines, thanks to D2's powerful string handling. Enjoy!
>> http://github.com/Dubhead/d2tags
>> usage:
>> % dmd -Xftags.json foo.d
>> % d2tags tags.json > tags
> Very useful, and a beautiful example of D scripting.
> I wonder if this is of enough general utility to warrant inclusion within the D distribution, along with rdmd. Thoughts?
Yes it's very useful. How about also including the source in the examples directory?
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
On Fri, 07 May 2010 01:48:59 +0900, Andrei Alexandrescu wrote:
> MIURA Masahiro wrote:
>> Hi,
>> Being happy to see issue 3415 (broken JSON format) fixed, I have written a utility to convert DMD2's JSON output to Exuberant Ctags format. This enables you to tagjump in Vim and other editors/IDEs. It's just 150+ lines, thanks to D2's powerful string handling. Enjoy!
>> http://github.com/Dubhead/d2tags
>> usage:
>> % dmd -Xftags.json foo.d
>> % d2tags tags.json > tags
> Very useful, and a beautiful example of D scripting.
> I wonder if this is of enough general utility to warrant inclusion within the D distribution, along with rdmd. Thoughts?
vote++
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
Andrei Alexandrescu, on May 6 at 09:48 you wrote to me:
> MIURA Masahiro wrote:
> > Hi,
> > Being happy to see issue 3415 (broken JSON format) fixed, I have written a utility to convert DMD2's JSON output to Exuberant Ctags format. This enables you to tagjump in Vim and other editors/IDEs. It's just 150+ lines, thanks to D2's powerful string handling. Enjoy!
> > http://github.com/Dubhead/d2tags
> > usage:
> > % dmd -Xftags.json foo.d
> > % d2tags tags.json > tags
> Very useful, and a beautiful example of D scripting.
> I wonder if this is of enough general utility to warrant inclusion within the D distribution, along with rdmd. Thoughts?
I think it might be better to add support to the common tools, like exuberant-ctags[1], but having it as part of rdmd or whatever could be nice too.
[1] http://ctags.sourceforge.net/
-- Leandro Lucarella (AKA luca) http://llucax.com.ar/
-- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
-- To which Peperino answered them: he who has chilblains, let him wet them; he who suffers from baldness does not suffer a teddy bear; it is not good to eat suckling pig on a day of gastritis; do not mix wine with watermelon; take out the trash after eight; in case of emergency break the glass with the hammer; in one hundred meters, detour via Pavón.
-- Peperino Pómoro
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
Pelle wrote: On 05/06/2010 06:48 PM, Andrei Alexandrescu wrote: I wonder if this is of enough general utility to warrant inclusion within the D distribution, along with rdmd. Thoughts? Yes please, rdmd --tags would be great. I was thinking of including the utility as a separate program. Andrei
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
On 05/06/2010 06:48 PM, Andrei Alexandrescu wrote: I wonder if this is of enough general utility to warrant inclusion within the D distribution, along with rdmd. Thoughts? Yes please, rdmd --tags would be great.
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
MIURA Masahiro wrote:
> Hi,
> Being happy to see issue 3415 (broken JSON format) fixed, I have written a utility to convert DMD2's JSON output to Exuberant Ctags format. This enables you to tagjump in Vim and other editors/IDEs. It's just 150+ lines, thanks to D2's powerful string handling. Enjoy!
> http://github.com/Dubhead/d2tags
> usage:
> % dmd -Xftags.json foo.d
> % d2tags tags.json > tags
Very useful, and a beautiful example of D scripting.
I wonder if this is of enough general utility to warrant inclusion within the D distribution, along with rdmd. Thoughts?
One small suggestion, Masahiro: you may want to replace the file reading loop in main() with simply std.file.readText(args[1]).
Andrei
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
On 06/05/10 11:46, MIURA Masahiro wrote:
> Hi,
> Being happy to see issue 3415 (broken JSON format) fixed, I have written a utility to convert DMD2's JSON output to Exuberant Ctags format. This enables you to tagjump in Vim and other editors/IDEs. It's just 150+ lines, thanks to D2's powerful string handling. Enjoy!
> http://github.com/Dubhead/d2tags
> usage:
> % dmd -Xftags.json foo.d
> % d2tags tags.json > tags
I love it! I don't suppose you have a guide for how to get it set up and working in vim do you? I've never managed to get ctags working, even with C/C++ :/
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
MIURA Masahiro, on May 6 at 19:46 you wrote to me:
> Hi,
> Being happy to see issue 3415 (broken JSON format) fixed, I have written a utility to convert DMD2's JSON output to Exuberant Ctags format. This enables you to tagjump in Vim and other editors/IDEs. It's just 150+ lines, thanks to D2's powerful string handling. Enjoy!
> http://github.com/Dubhead/d2tags
> usage:
> % dmd -Xftags.json foo.d
> % d2tags tags.json > tags
% vim -t tags foo.d
Great!
-- Leandro Lucarella (AKA luca) http://llucax.com.ar/
-- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
-- How outrageous, this country keeps going further and further backwards, further backwards...
-- Sidharta Kiwi
Re: dmd 1.060 and 2.045 release
Steven Schveighoffer, on May 6 at 07:17 you wrote to me:
> On Wed, 05 May 2010 23:45:50 -0400, Walter Bright wrote:
> >Walter Bright wrote:
> >>Alex Makhotin wrote:
> >>>It takes ~40 seconds 50% load on the dual core processor(CentOS 5.3 kernel 2.6.32.4), to get the actual error messages about the undefined identifier.
> >>Definitely there's a problem.
> >The problem is the spell checker is O(n*n) on the number of characters in the undefined identifier.
> That can't be it. The identifier shown by Alex is only 33 characters. O(n^2) is not that slow, especially for smaller variables. There must be other factors you're not considering...
Run a profiler.
-- Leandro Lucarella (AKA luca) http://llucax.com.ar/
-- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
-- The sound of the sea would not exist if life lacked ear and seashell.
-- Ricardo Vaporeso. Cosquín, 1908.
Re: dmd 1.060 and 2.045 release
On 2010-05-05 23:45:50 -0400, Walter Bright said: Walter Bright wrote: Alex Makhotin wrote: It takes ~40 seconds 50% load on the dual core processor(CentOS 5.3 kernel 2.6.32.4), to get the actual error messages about the undefined identifier. Definitely there's a problem. The problem is the spell checker is O(n*n) on the number of characters in the undefined identifier. That's an algorithm that can't scale then. Checking the Levenshtein distance for each known identifier within a small difference in length would be a better idea. (Clang is said to use the Levenshtein distance, it probably does something of the sort.) http://en.wikipedia.org/wiki/Levenshtein_distance -- Michel Fortin michel.for...@michelf.com http://michelf.com/
Re: dmd 1.060 and 2.045 release
Hello Walter,
> Walter Bright wrote:
>> Alex Makhotin wrote:
>>> It takes ~40 seconds 50% load on the dual core processor(CentOS 5.3 kernel 2.6.32.4), to get the actual error messages about the undefined identifier.
>> Definitely there's a problem.
> The problem is the spell checker is O(n*n) on the number of characters in the undefined identifier.
How about switching algos for long identifiers: you could bucket the known identifiers by length and compare histograms on things of similar length. Or maybe just turn it off for long names.
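A rough Python sketch of the bucket-and-histogram suggestion follows. It is one possible reading of the idea, not anything from dmd: the function names are made up, and the histogram comparison is used as a cheap lower bound on the edit distance so that obviously distant names are skipped before any expensive comparison.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


def bucket_by_length(names: Iterable[str]) -> Dict[int, List[str]]:
    """Group known identifiers by their length."""
    buckets: Dict[int, List[str]] = defaultdict(list)
    for name in names:
        buckets[len(name)].append(name)
    return buckets


def histogram_lower_bound(a: str, b: str) -> int:
    """Cheap lower bound on the edit distance, from character counts alone."""
    ca, cb = Counter(a), Counter(b)
    surplus = sum((ca - cb).values())  # characters a has in excess of b
    deficit = sum((cb - ca).values())  # characters b has in excess of a
    # One edit can remove at most one surplus and one deficit character.
    return max(surplus, deficit)


def plausible_candidates(unknown: str, buckets: Dict[int, List[str]],
                         max_distance: int = 2) -> List[str]:
    """Names of similar length whose histograms do not rule them out."""
    result = []
    for length in range(len(unknown) - max_distance, len(unknown) + max_distance + 1):
        for name in buckets.get(length, []):
            if histogram_lower_bound(unknown, name) <= max_distance:
                result.append(name)
    return result
```

The survivors would still need a real distance check, since two names can share a histogram and differ in order; the filter only discards names that cannot possibly be close.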
Re: d2tags - converts DMD2's JSON output to Exuberant Ctags format
On 06/05/10 22:46, MIURA Masahiro wrote:
> Hi,
> Being happy to see issue 3415 (broken JSON format) fixed, I have written a utility to convert DMD2's JSON output to Exuberant Ctags format. This enables you to tagjump in Vim and other editors/IDEs. It's just 150+ lines, thanks to D2's powerful string handling. Enjoy!
> http://github.com/Dubhead/d2tags
> usage:
> % dmd -Xftags.json foo.d
> % d2tags tags.json > tags
Awesome!
Re: dmd 1.060 and 2.045 release
On Wed, 05 May 2010 23:45:50 -0400, Walter Bright wrote: Walter Bright wrote: Alex Makhotin wrote: It takes ~40 seconds 50% load on the dual core processor(CentOS 5.3 kernel 2.6.32.4), to get the actual error messages about the undefined identifier. Definitely there's a problem. The problem is the spell checker is O(n*n) on the number of characters in the undefined identifier. That can't be it. The identifier shown by Alex is only 33 characters. O(n^2) is not that slow, especially for smaller variables. There must be other factors you're not considering... -Steve
d2tags - converts DMD2's JSON output to Exuberant Ctags format
Hi,
Being happy to see issue 3415 (broken JSON format) fixed, I have written a utility to convert DMD2's JSON output to Exuberant Ctags format. This enables you to tagjump in Vim and other editors/IDEs. It's just 150+ lines, thanks to D2's powerful string handling. Enjoy!
http://github.com/Dubhead/d2tags
usage:
% dmd -Xftags.json foo.d
% d2tags tags.json > tags
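For readers curious what such a conversion involves, here is a rough Python sketch. It is not d2tags (which is written in D), and the JSON field names used here (`file`, `members`, `name`, `line`, `kind`) are assumptions about the layout of DMD's JSON output rather than something stated in this thread. Each output line follows the basic ctags layout of tag name, file name and address separated by tabs, with the kind appended as an extension field.

```python
import json
from typing import Iterator, List


def iter_tags(modules: list) -> Iterator[str]:
    """Yield one ctags-style line per named member that carries a line number."""
    for module in modules:
        filename = module.get("file", "")           # assumed field name
        stack = list(module.get("members", []))     # assumed field name
        while stack:
            member = stack.pop()
            if "name" in member and "line" in member:
                fields = [
                    member["name"],
                    filename,
                    str(member["line"]) + ';"',      # line-number address, then comment marker
                    "kind:" + str(member.get("kind", "")),
                ]
                yield "\t".join(fields)
            # Nested declarations (e.g. class members) are visited as well.
            stack.extend(member.get("members", []))


def json_to_tags(json_text: str) -> List[str]:
    """Convert JSON text to a sorted list of tag lines (editors expect sorted tags)."""
    return sorted(iter_tags(json.loads(json_text)))
```

The resulting lines could be written to a file named `tags`, one per line, for an editor to pick up.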