At 10:01 AM +0100 8/23/08, Matthew Seaman wrote:
>Walt Pawley wrote:
>>
>> At the risk of beating this to death, I just happened to
>> stumble on a real world example of why one might want to use
>> Perl for sed-ly stuff.
>>
>> ... snip ...
>>
>> wump$ ls -l Desktop/klog
>> -rw-r--r-- 1 wump 1001 52753322 22 Aug 16:37 Desktop/klog
>> wump$ time sed "s/ .*//" Desktop/klog > kadr1
>>
>> real 0m10.800s
>> user 0m10.580s
>> sys 0m0.250s
>> wump$ time perl -pe 's/ .*//' Desktop/klog > kadr2
>>
>> real 0m0.975s
>> user 0m0.700s
>> sys 0m0.270s
>> wump$ cmp kadr1 kadr2
>> wump$
>>
>> Why disparity in execution speed? ...
>
>Careful now. Have you accounted for the effect of the klog file
>being cached in VM rather than having to be read afresh from disk?
>It makes a very big difference in how fast it is processed.

No, I hadn't done any such accounting. So I wrote a little
script, which you can surmise from the following output:

wump$ sh -v spdtst
time perl -pe 's/ .*//' Desktop/klog > /dev/null

real 0m0.961s
user 0m0.740s
sys 0m0.230s
time sed "s/ .*//" Desktop/klog > /dev/null

real 0m10.506s
user 0m10.270s
sys 0m0.250s
time awk '{print $1}' Desktop/klog > /dev/null

real 0m2.333s
user 0m2.140s
sys 0m0.180s
time sed "s/ .*//" Desktop/klog > /dev/null

real 0m10.489s
user 0m10.250s
sys 0m0.230s
time perl -pe 's/ .*//' Desktop/klog > /dev/null

real 0m0.799s
user 0m0.580s
sys 0m0.220s

>In order to get meaningful data for this sort of test you should
>do a dummy run or two of each command in fairly quick succession,
>and then repeat your test runs a number of times and look at the
>average and standard deviation of the execution times. ...

Yeah, Hoyle would like that. But for me, I think the results
are clear enough without all the messing with statistical
computations. A ratio of 10 to 1 or better is good enough for me
to think there's some major difference.

That said, it would appear that caching can make a difference -
which is why I put the Perl invocation first ... so it would be
running without the benefit of caching. But I don't believe I
was entirely successful in that effort. The very first time I
ran this, which was also the very first time in a whole day that
the klog file had been accessed, the first Perl invocation took
about 2 seconds of real time and still only 0.7 seconds of user
time. So I don't believe caching explains the execution speed
disparity.

It was mentioned that this function is made for awk, so I tried
that as well. It is also evidently not as quick as Perl at doing
the job. The time shown above is quite consistent with a number
of other runs I've tried with awk.

I suspect a real Perl internals maven could explain this. I have
some ideas but they're conjecture. Perhaps some effort to improve
execution efficiency in sed and awk would not be wasted?
--
Walter M. Pawley <[EMAIL PROTECTED]>
Wump Research & Company
676 River Bend Road, Roseburg, OR 97470
541-672-8975
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
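
For anyone who does want the warm-up-and-repeat measurement suggested
in the thread, here is a minimal sketch in plain sh. It is not the
spdtst script from the message. The function names bench and mean_sd
are made up for illustration, and it assumes a POSIX time utility at
/usr/bin/time that accepts -p. Note that it also picks up any line the
timed command itself writes to stderr that happens to start with
"real".

```shell
#!/bin/sh
# mean_sd (hypothetical helper): read one number per line on stdin and
# print the count, the mean and the sample standard deviation.
mean_sd() {
    awk '{ n++; s += $1; ss += $1 * $1 }
         END {
             if (n == 0) { print "no data"; exit 1 }
             mean = s / n
             # Sample variance; clamp tiny negative rounding errors to 0.
             var = (n > 1) ? (ss - n * mean * mean) / (n - 1) : 0
             if (var < 0) var = 0
             printf "n=%d mean=%.3f sd=%.3f\n", n, mean, sqrt(var)
         }'
}

# bench RUNS CMD... (hypothetical helper): run CMD once untimed as a
# warm-up so the input file is likely cached, then run it RUNS times and
# print the elapsed real time of each run in seconds, one per line.
# Assumes /usr/bin/time supports the POSIX -p output format.
bench() {
    runs=$1
    shift
    "$@" > /dev/null 2>&1
    i=0
    while [ "$i" -lt "$runs" ]; do
        # time -p reports "real N.NN" on stderr; the command's own
        # stdout is discarded and stderr is sent down the pipe.
        { /usr/bin/time -p "$@" > /dev/null; } 2>&1 |
            awk '$1 == "real" { print $2 }'
        i=$((i + 1))
    done
}
```

Usage would be along the lines of

    bench 5 sed 's/ .*//' Desktop/klog | mean_sd
    bench 5 perl -pe 's/ .*//' Desktop/klog | mean_sd

Because of the warm-up run, this only compares the commands with the
file already cached; it says nothing about cold-cache reads.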