On 25.03.21 at 00:30, Avi Gross wrote:
It [awk] is, as noted, a great tool and if you only had one or a few tools
like it available, it can easily be bent and twisted to do much of what the
others do as it is more programmable than most. But following that line of
reasoning, fairly simple python scripts can be written with python -c "..."
or by pointing to a script

The thing with awk is that lots of useful text processing is built directly into the main syntax; in Python you can certainly do the same, but it requires loading a library. The simple column summation mentioned before by Cameron would be

   awk ' {sum += $2 } END {print sum}'

which can easily be typed on a command line, with the benefit that lines where the 2nd column is not a valid number add nothing to the sum (awk evaluates a non-numeric field as 0). This is important because there are often empty lines in the middle, an empty line at the end, some ASCII headers, or whatever.

The closest equivalent I can come up with in Python is this:

==============================
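import sys

s = 0
for line in sys.stdin:
    try:
        # Second whitespace-separated field, like $2 in awk
        s += float(line.split()[1])
    except (IndexError, ValueError):
        # Fewer than two fields, or the field is not a number: skip the line
        pass
print(s)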
===================================


I don't want to cram this into a python -c "..." line, if that is even possible; how do you handle indentation levels and loops?
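For what it's worth, the loop can be folded into a single expression if the try/except is moved into a helper, because a generator expression needs no indentation. The following is only a sketch under the assumption of whitespace-separated input; the names to_float and sum_column are made up for illustration and are not from any library:

```python
import io


def to_float(text):
    """Return text as a float, or 0.0 if it is not a valid number."""
    try:
        return float(text)
    except ValueError:
        return 0.0


def sum_column(lines, col=1):
    """Sum field number `col` (0-based) over all lines that have such a field."""
    fields = (line.split() for line in lines)
    return sum(to_float(f[col]) for f in fields if len(f) > col)


# Header line, blank line and a one-field line are all tolerated.
sample = io.StringIO("header value\na 1.5\n\nb 2.5\nc\n")
print(sum_column(sample))  # prints 4.0
```

Without the helper, something like the following does fit into python -c, but the filter is much cruder than try/except and only recognises plain non-negative decimals:

   python3 -c "import sys; print(sum(float(f[1]) for f in map(str.split, sys.stdin) if len(f) > 1 and f[1].replace('.', '', 1).isdigit()))"

In a POSIX shell the quoted -c argument may also span several lines, so the indented loop version can be pasted as is, though that is hardly a one-liner any more.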

Of course, for big fancy programs Python is a much better choice than awk, no questions asked - but awk has a place for little things which fit the special programming model, and there are surprisingly many applications where this is just the easiest and fastest way to do the job.

It's like regexes - a few simple characters can do a job which otherwise requires a bulky program, but once the parsing reaches a certain complexity, a true parsing language, or even just hand-coded Python, is much more maintainable.

        Christian

PS: Exercise - handle lines commented out with a '#', i.e. skip those. In awk:

gawk '!/^\s*#/ {sum += $2 } END {print sum}'
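One possible Python answer to the exercise, as a sketch rather than a definitive solution. It assumes a comment is any line whose first non-blank character is '#', which is what the gawk pattern tests (the \s there is a GNU extension, hence gawk). The function name sum_uncommented is made up for illustration:

```python
import io


def sum_uncommented(lines, col=1):
    """Sum field `col` (0-based), ignoring '#' comment lines and bad fields."""
    total = 0.0
    for line in lines:
        if line.lstrip().startswith("#"):
            continue  # the same lines the gawk pattern !/^\s*#/ rejects
        try:
            total += float(line.split()[col])
        except (IndexError, ValueError):
            pass  # too few fields, or the field is not a number
    return total


sample = io.StringIO("# x 100\nx 1\n  # y 200\ny 2.5\n\n")
print(sum_uncommented(sample))  # prints 3.5
```

Passing sys.stdin instead of the StringIO sample makes it usable as a filter, at roughly ten times the typing of the gawk version.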
