On Friday 22 October 2010 15:29:36 William L. Thomson Jr. wrote:
> Long ago when taking a racing course, there was a saying:
> "A fast lap around the track is a slow lap in the cockpit"
> 
> The same holds true for programming. A quickly written program in a high
> level language saves time from a programmer's perspective. But it likely
> makes any execution take longer and require more resources. I equate
> that to a quick lap in the cockpit, slow lap around the track. Quick to
> develop, slower to run, and many more people running it than developing
> it.

Hi William,

What you say is unassailably true on the face of it. Obviously an identical 
algorithm in C will be faster than in Python.

But in most cases, that and a dollar fifty gets you on the bus. It's usually 
not relevant. To your track/cockpit saying I'd like to add this saying by 
Donald Knuth:

===========================================
“We should forget about small efficiencies, say about 97% of the time: 
premature optimization is the root of all evil. Yet we should not pass up our 
opportunities in that critical 3%. A good programmer will not be lulled into 
complacency by such reasoning, he will be wise to look carefully at the 
critical code; but only after that code has been identified.”
===========================================

Optimize what makes a difference, and only what makes a difference.

Non-users of my UMENU program (http://www.troubleshooters.com/umenu/) always 
gripe about the following:

1) For each menu choice it must read another file
2) It's written in Perl (there's also a Ruby version) instead of C

But here are the facts: there's not a typist on earth who can outrun UMENU on 
a modern computer running Linux. From the typist's view, UMENU's response is 
instantaneous. So tell me one more time, just so I understand: why should I 
make it faster?

Now let me enumerate the advantages of Perl over C (a short sketch follows 
the list):

1) Built-in, easy-to-use regexes: no additional libraries needed
2) Built-in garbage collection
3) Arrays and hashes don't use pointers, so less risk of memory tromping
4) Arrays and hashes expand automatically, so less risk of buffer overflows
5) Using Perl's well-tested high level tools means you're less likely to have 
bugs or security flaws
6) Perl's faster development speed means you can spend more time thinking 
about your algorithm, and possibly creating one that's simpler and faster
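
To make points 1, 3 and 4 concrete, here's a minimal Perl sketch (a toy of my 
own, not taken from UMENU) that counts words using a built-in regex and an 
auto-expanding hash: no regex library, no malloc, no fixed-size buffer:

====================================
#!/usr/bin/perl -w
use strict;

my %count;                # the hash springs into existence and grows as needed
while (<>) {
    while (/(\w+)/g) {    # built-in regex, no extra library to link
        $count{lc $1}++;  # no pointers to manage, no buffer to overflow
    }
}
for my $word (sort keys %count) {
    print "$count{$word}\t$word\n";
}
====================================

The same program in C would need a regex library (POSIX regex.h or the like), 
a hash table implementation, and careful manual memory management.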

Of course, when I had to write a prime number generator that produces 
billions of primes in a few minutes, I used C 
(http://www.troubleshooters.com/codecorn/primenumbers/primenumbers.htm). My 
mamma didn't raise no fool. But in the following situations I'll use Perl over 
C every time:

* The speed bottleneck is the user's response time
* The process is done *rarely*, and the difference is 2 seconds vs 2 minutes
* A better, simpler, faster algorithm can be found
* For whatever reason, the Perl program's speed is "fast enough"

If a small part of the process takes the majority of the time, I might write 
that part in C and the rest in Perl.
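
One way to get that split (an assumption on my part: I'm using the CPAN 
Inline::C module here as an illustration, not describing any program of mine) 
is to compile the hot spot as C right inside the Perl script. A minimal 
sketch with a toy function:

====================================
#!/usr/bin/perl -w
use strict;

# The hot inner loop written in C; everything else stays in Perl.
# Requires the CPAN Inline::C module and a C compiler.
use Inline C => <<'END_C';
long sum_squares(long n) {
    long i, total = 0;
    for (i = 1; i <= n; i++)
        total += i * i;
    return total;
}
END_C

print sum_squares(1000), "\n";   # called like any ordinary Perl sub
====================================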

I'd also like to point out that sometimes you can get the advantages of both. 
My home-grown HTTP log file analysis program must do absolutely massive 
work, given that Troubleshooters.Com averages over 5K visits per day. So the 
first part of my program goes something like this:

====================================
cat `./logfilelist.cgi`                             | \
grep -v "\.png "                                    | \
grep -v "\.gif "                                    | \
grep -v "\.ico "                                    | \
grep -v "\.js H"                                    | \
grep -v "\.css"                                     | \
grep -v "index.cgi"                                 | \
grep " 200 "                                        | \
grep -v "\.jpg "                                    | \
grep "\"GET "                                       | \
grep -v "\.class "                                  | \
sed -e 's/ HTTP\/.*//'                              | \
sed -e 's/ - - \[/###/'                             | \
sed -e 's/ \"GET /###/'                             | \
sed -e 's/ -....]//'                                | \
sed -e 's+\(..\)/\(...\)/\(....\):+\3/\2/\1@+'      | \
sed -f  'months.sed'                                | \
./logeval_worker.cgi
====================================

So the first thing done, via that pipe sequence, is to discard irrelevant 
lines, passing through only the relevant ones. This saves quite a bit of time 
and is performed by the Linux grep command, which is highly optimized and 
heavily tested C. The next thing I do is use sed, also highly optimized and 
heavily tested C, to do simple data processing via regexes.

Then I take that much smaller, pre-processed file and send it to 
logeval_worker.cgi, which is a Perl program I wrote. logeval_worker.cgi does 
all the specialized data processing I can't do with Linux commands: the 
various accumulation tasks, break logic, program output, fixing malformed 
IP addresses, converting 3-letter months to 2-digit months, accounting for 
special URLs enumerated in a special URL file, and handling special events 
(got slashdotted, added info, etc.). It's a big, complex, featureful program 
that would have taken serious time to write in C, and I wanted it done, so 
I did it in Perl.
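
To give a flavor of that work, here's a minimal sketch of my own (not the 
real logeval_worker.cgi) showing two of those jobs, month conversion and 
accumulation, assuming the ###-delimited ip###date###url records the sed 
commands above produce:

====================================
#!/usr/bin/perl -w
use strict;

# Map 3-letter months to 2-digit months, one of the worker's jobs
my %mon = (Jan => '01', Feb => '02', Mar => '03', Apr => '04',
           May => '05', Jun => '06', Jul => '07', Aug => '08',
           Sep => '09', Oct => '10', Nov => '11', Dec => '12');

my (%hits, %monthly);    # accumulator hashes grow by themselves

while (<>) {
    chomp;
    my ($ip, $date, $url) = split /###/;   # assumed record layout
    next unless defined $url;
    my ($m) = $date =~ /(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)/;
    $monthly{$mon{$m}}++ if defined $m;    # visits per 2-digit month
    $hits{$url}++;                         # visits per URL
}

# Report URLs from most visited to least
for my $url (sort { $hits{$b} <=> $hits{$a} } keys %hits) {
    printf "%8d  %s\n", $hits{$url}, $url;
}
====================================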

I haven't run this in a long time, because my web host made it hard to FTP 
down raw logs, but as I remember it took about 10 minutes for a couple of 
months' worth of logs, which was acceptable to me. If it hadn't been 
acceptable, I could then have either optimized the algorithm or rewritten 
part or all of it in C, after seeing how well it ran.

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US
Twitter: http://www.twitter.com/stevelitt

