Bill Moseley wrote:
At 12:00 AM 01/27/02 +0800, Stas Bekman wrote:
So we have about 3MB of source in 134 files (and more likely 6MB in 200+ files once the 2.0 docs are done). Do you think it's possible to grep through that with a reasonable response time? Remember that there will be a lot of I/O for opening and closing many files.
It's not like mod_perl is a high volume site. And it's running on a much faster machine than mine:
~/modperl-docs > find src -name '*.pod' | wc -l
105
~/modperl-docs > time find src -name '*.pod' | xargs fgrep '$|' | wc -l
23

real 0m0.033s
user 0m0.030s
sys 0m0.010s
That seems reasonable enough, even if it was ten times slower.
Hmm, you were trying this on an unloaded machine, right? If you have many parallel searches and other tasks running, this can be much, much slower, no?
Also remember that the user doesn't care about CPU time, only about elapsed wall-clock time.
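One way to check the concern about parallel load is to start several copies of the same search at once and measure elapsed wall-clock time rather than CPU time. This is a minimal sketch, assuming a POSIX-like shell. The function name `parallel_search` is made up for illustration, `grep -F` stands in for `fgrep`, and `date +%s` (whole-second resolution, widely available but not strictly POSIX) is an assumption, not something used in this thread.

```shell
# Hypothetical helper: run N copies of a fixed-string search in parallel
# and print the elapsed wall-clock seconds for the whole batch.
# usage: parallel_search DIR PATTERN N
parallel_search() {
    dir=$1 pattern=$2 n=$3
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        # Each search runs in the background, like one concurrent request.
        find "$dir" -name '*.pod' -exec grep -F -l -e "$pattern" {} + >/dev/null &
        i=$((i + 1))
    done
    wait    # block until every background search has finished
    end=$(date +%s)
    echo $((end - start))
}
```

For example, `parallel_search src '$|' 20` would print how many seconds 20 simultaneous searches took in total. With a docs tree this small the answer may well round to 0, in which case a finer-grained timer would be needed.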
Also, which OS/distro are you running this on? How does time get through the pipe? It doesn't work for me. If I try:
time find src -name '*.pod' -exec fgrep -l '$|' {} \;
src/docs/2.0/devel/debug_c/debug_c.pod
src/docs/2.0/devel/testing/testing.pod
src/docs/1.0/faqs/cgi_to_mod_perl.pod
src/docs/1.0/guide/control.pod
src/docs/1.0/guide/debug.pod
src/docs/1.0/guide/perl.pod
src/docs/1.0/guide/performance.pod
src/docs/1.0/guide/porting.pod
src/docs/1.0/guide/scenario.pod
src/docs/1.0/guide.good/control.pod
src/docs/1.0/guide.good/debug.pod
src/docs/1.0/guide.good/perl.pod
src/docs/1.0/guide.good/performance.pod
src/docs/1.0/guide.good/porting.pod
src/docs/1.0/guide.good/scenario.pod
0.120u 0.170s 0:00.31 93.5% 0+0k 0+0io 18193pf+0w
As you can see, it's much slower.
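Part of the gap between the two timings is probably that `-exec ... \;` starts one `fgrep` process per file, while `xargs` hands many file names to each `fgrep`. Whether `time` covers a whole pipeline also depends on the shell: in bash it is a reserved word that times the full pipeline, while wrapping the pipeline in `sh -c` turns it into a single command that any shell's `time` can measure. The sketch below works under those assumptions. The function names are hypothetical, and `grep -F` is used as the equivalent of `fgrep`.

```shell
# One grep process per file (what `-exec ... \;` does).
search_per_file() {
    find "$1" -name '*.pod' -exec grep -F -l -e "$2" {} \;
}

# Many files per grep process (`-exec ... {} +` batches arguments like xargs).
search_batched() {
    find "$1" -name '*.pod' -exec grep -F -l -e "$2" {} +
}

# To time a whole pipeline as one command, whatever the interactive shell:
#   time sh -c 'find src -name "*.pod" | xargs grep -F -l -e "\$|"'
```

Both functions should list the same files, so timing `search_per_file src '$|'` against `search_batched src '$|'` on the same machine isolates the cost of the extra process starts.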
I think that Randy's setup was quite satisfying, but nextrieve was even better. What do you think about nextrieve?
All the reverse indexing engines will parse on indexing, so it will always be an issue of defining what makes up a word.
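To make the "what makes up a word" problem concrete, here is a toy sketch of an index whose tokenizer keeps Perl variables such as `$|` intact instead of discarding the punctuation. It is not how nextrieve or any other engine mentioned here works. The function names, the token pattern, and the tab-separated index format are all assumptions for illustration, and `grep -o` is a common extension rather than strict POSIX.

```shell
# Hypothetical tokenizer: read text on stdin, emit one token per line.
# Perl variables ($|, $_, @ARGV, %ENV) count as single words.
perl_tokens() {
    grep -oE '[$@%][A-Za-z_][A-Za-z0-9_:]*|\$[^A-Za-z0-9_[:space:]]|[A-Za-z_][A-Za-z0-9_]*'
}

# Build "token<TAB>file" pairs for every .pod file under a directory.
# usage: build_index DIR > index.tsv
build_index() {
    find "$1" -name '*.pod' | while IFS= read -r f; do
        perl_tokens < "$f" | sort -u | awk -v f="$f" '{ print $0 "\t" f }'
    done
}

# Print the files whose token list contains TOKEN exactly.
# usage: lookup INDEX_FILE TOKEN
lookup() {
    awk -F '\t' -v t="$2" '$1 == t { print $2 }' "$1"
}
```

With an index built this way, `lookup index.tsv '$|'` is a single pass over one file instead of opening every document, but the results are only as good as the tokenizer's idea of a word.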
Let me ask Avi Rappoport if there's something good for searching code.
I don't know much about it. It's not open source, and it's not free. I really doubt it integrates with Template Toolkit.
Ah, OK. I didn't know that.
Could we feed the pod source into Parse::RecDescent and get it to tokenize perl code? That would be more fun.
I guess so, but from what I know, Parse::RecDescent is not good for real-time processing because it's very slow. Remember that it stores the parse tree using Perl data structures, which is very inefficient. I don't know whether it has been rewritten to use C data structures since last year.
_____________________________________________________________________
Stas Bekman JAm_pH -- Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide http://perl.apache.org/guide
mailto:[EMAIL PROTECTED] http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/
