[ On April 26, 2000 at 18:41:32 (+0200), Akim Demaille wrote: ]
> Subject: Performances of awk
>
> What's so expensive is the finalizing loop which uses a small AWK
> script.  If you concentrate the measure on this very script, the
> performance penalty is frightening:
> 
> | ~/src/fileutils % cat > /tmp/finalize.awk
> |     {
> |       sub(/[         ]*$/, "")
> |       if ($0 == "")
> |         {
> |           if (!duplicate)
> |             print
> |           duplicate = 1
> |           next
> |         }
> |       duplicate = 0
> |       oline++
> |       while (sub(/__oline__/, oline))
> |         continue
> |       while (sub(/@<:@/, "["))
> |         continue
> |       while (sub(/@:>@/, "]"))
> |         continue
> |       while (sub(/@S\|@/, "$"))
> |         continue
> |       while (sub(/@%:@/, "#"))
> |         continue
> |       print
> |     }
> | ~/src/fileutils % time mawk -f /tmp/finalize.awk < configure >/dev/null
> |   9,51s user 0,02s system 100% cpu 9,521 total
> | ~/src/fileutils % time gawk -f /tmp/finalize.awk < configure >/dev/null
> |   0,89s user 0,01s system 101% cpu 0,890 total
> 
> So, should we change AC_PROG_AWK?  Should the package Autoconf use a
> different macro?

I don't know exactly what's going on here, but I am sure that mawk is
usually a good choice, and sometimes still the best choice:

17:12 [104] $ time nawk -f /home/most/woods/src/finalize.awk < gmp/configure >>
    0.28s real     0.15s user     0.11s system
17:12 [105] $ time mawk -f /home/most/woods/src/finalize.awk < gmp/configure >>
    0.17s real     0.13s user     0.03s system
17:12 [106] $ time gawk -f /home/most/woods/src/finalize.awk < gmp/configure >>
    0.33s real     0.27s user     0.05s system

17:13 [107] $ time nawk -f /home/most/woods/src/finalize.awk < ggrep/configure>
    0.22s real     0.09s user     0.08s system
17:14 [108] $ time mawk -f /home/most/woods/src/finalize.awk < ggrep/configure>
    0.08s real     0.05s user     0.01s system
17:14 [109] $ time gawk -f /home/most/woods/src/finalize.awk < ggrep/configure>
    0.13s real     0.10s user     0.02s system

17:14 [110] $ time nawk -f /home/most/woods/src/finalize.awk < newsyslog/confi>
    0.44s real     0.20s user     0.20s system
17:14 [111] $ time mawk -f /home/most/woods/src/finalize.awk < newsyslog/confi>
    0.37s real     0.35s user     0.01s system
17:14 [112] $ time gawk -f /home/most/woods/src/finalize.awk < newsyslog/confi>
    0.32s real     0.30s user     0.01s system

17:19 [132] $ time nawk -f /home/most/woods/src/finalize.awk < fingerd/configu>
    0.53s real     0.22s user     0.24s system
17:19 [133] $ time mawk -f /home/most/woods/src/finalize.awk < fingerd/configu>
    0.55s real     0.40s user     0.02s system
17:19 [134] $ time gawk -f /home/most/woods/src/finalize.awk < fingerd/configu>
    0.40s real     0.32s user     0.03s system

Only this last one (which is the largest configure script I had handy)
showed mawk wasting time beyond all reason:

17:21 [136] $ time nawk -f /home/most/woods/src/finalize.awk < amanda/configur>
    3.57s real     1.94s user     1.53s system
17:21 [137] $ time mawk -f /home/most/woods/src/finalize.awk < amanda/configur>
   54.65s real    52.90s user     0.28s system
17:22 [138] $ time gawk -f /home/most/woods/src/finalize.awk < amanda/configur>
    2.46s real     2.29s user     0.12s system

So, there seem to be some odd-ball cases where mawk is dramatically
slower than either nawk or gawk (in the amanda run above it used more
than twenty times the user time of either).  Note that the above
timings were done with mawk-1.2.2.  The current version is 1.3.3, so
perhaps even the odd-ball cases will work faster with it.

(What's interesting in my timings is the excessive system time that nawk
always seems to take!  ;-)

Now I do have to ask what the purpose of that awk script could possibly
be, and why it has to do things the way it seems to want to do them?
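
For comparison, each "while (sub(...)) continue" loop in the quoted
script restarts its scan from the beginning of the line after every
replacement, whereas gsub() makes a single pass.  The following is only
a sketch of a gsub()-based variant, not the script Autoconf uses: the
finalize function name, the AWK variable, and the sample input are made
up for illustration, and the "[ \t]" in the first pattern is an
assumption about what the whitespace class in the quoted script was.
The variant is not timed here, so whether it avoids the mawk slowdown
is an open question.  It also differs from the original if a
replacement could ever create a new match together with the text
around it, since gsub() does not rescan replaced text.

```shell
#!/bin/sh
# Sketch only: a gsub()-based variant of the quoted finalize.awk.
# AWK and finalize are hypothetical names used for this illustration.
AWK=${AWK:-awk}

finalize() {
  "$AWK" '
  {
    # Strip trailing blanks (assumed to be spaces and tabs).
    sub(/[ \t]*$/, "")
    # Squeeze runs of empty lines down to a single empty line.
    if ($0 == "") {
      if (!duplicate)
        print
      duplicate = 1
      next
    }
    duplicate = 0
    # Count non-empty output lines and substitute the counter.
    oline++
    gsub(/__oline__/, oline)
    # Expand the quadrigraphs in one pass each.
    gsub(/@<:@/, "[")
    gsub(/@:>@/, "]")
    gsub(/@S\|@/, "$")
    gsub(/@%:@/, "#")
    print
  }'
}

# Made-up sample input: trailing blanks, two empty lines, quadrigraphs.
printf 'a@<:@1@:>@  \n\n\n@S|@x @%%:@ line __oline__\n' | finalize
# Prints:
#   a[1]
#
#   $x # line 2
```

Setting AWK=mawk, AWK=gawk or AWK=nawk before calling finalize selects
the implementation, which makes it easy to repeat the timings above
against this variant.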

-- 
                                                        Greg A. Woods

+1 416 218-0098      VE3TCP      <[EMAIL PROTECTED]>      <robohack!woods>
Planix, Inc. <[EMAIL PROTECTED]>; Secrets of the Weird <[EMAIL PROTECTED]>
