[Long enough that some may prefer neither to read nor comment.] Mats raised an issue that I think relates to how we tutor people in Python.
The issue is learning how to take a PROBLEM that looks massive and find ways to see it as a series of steps, where each step can either be easily solved using available tools and techniques OR can recursively be decomposed into smaller parts that can. Many people learn to program without first learning how to write down several levels of requirements that spell out how each part of the overall result needs to look, and finally how each part will be developed and tested. I worked in organizations with a division of labor meant to put this waterfall method in place. At times I would write higher-level architecture documents, followed by Systems Engineering documents, Developer documents, Unit Test, System Test, and even Field Support documents. The goal was to move from abstract to concrete, so that the actual development was mainly writing fairly small functions, often used multiple times, and gluing them together.

I looked back at the kind of tools used in UNIX and realized how limited they were relative to what is easily done in a language like Python, especially given the huge tool set you can import. The support for passing the output of one program to another made it easy to build pipelines. You can do that in Python too, but you rarely need to. And I claim there are many easy ways to do things even better in Python.

Many UNIX tools were simple filters. One would read a file or two and pass some of the lines, perhaps altered, through to standard output. The next process in the pipeline would often do the same, with a twist, and sometimes new lines might even be added. The simple tools like cat and grep and sed loosely fit the filter analogy; they worked mostly a line at a time. The more flexible tools like AWK and PERL are frankly more like Python than the simple tools. So if you had a similar task to do in Python, is there really much difference? I claim not much. Python has quite a few ways to build a filter.
One simple way is a list comprehension and its relatives. Other variations are the map and filter functions, and even reduce. Among other things, they can accept a list of lines of text and apply changes to them, keep just a subset, or calculate a result from them.

Let me be concrete. You have a set of lines to process, and you want to find all lines that pass through a gauntlet, perhaps with changes along the way. So assume you read an entire file (all at once, at THIS point) into a list of lines:

    stuff = open(...).readlines()

Condition 1 might be to keep only lines that contain some word or pattern. You might have used sed or grep in the UNIX shell to specify a fixed string or pattern to search for. So in Python, what might you do? Since stuff is a list, something like a list comprehension can handle many such needs:

    stuff2 = [some_function(line) for line in stuff if some_condition(line)]

The condition might be:

    "this" in line

Or it might be that the line ends with some phrase. Or a regular-expression search. Or that the length is long enough, or the number of words short enough. Every such condition can be the same sort of thing used in a UNIX pipeline, or a brand-new idea not available there, like whether a line translates into a set of numbers that are all prime! And the function applied to what is kept can transform it to uppercase, or replace it with something looked up in a dictionary, and so on. You might even apply multiple filters in each step. Python allows phrases like:

    line.strip().upper()

and conditions like:

    this or (that and not something_else)

The point is that a single line like the list comprehension above may already do what a pipeline of 8 simple commands in UNIX did, and more.
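To make the grep/sed analogy concrete, here is a minimal runnable sketch; the sample lines and the search string "this" are invented for illustration. It shows the same filter written once as a list comprehension and once with filter() and map():

```python
# Sample input, standing in for open(filename).readlines()
lines = [
    "this is line one\n",
    "another line\n",
    "this is the end\n",
]

# Roughly: grep "this" | tr a-z A-Z, as one list comprehension
kept = [line.strip().upper() for line in lines if "this" in line]
print(kept)  # ['THIS IS LINE ONE', 'THIS IS THE END']

# The same pipeline using the filter and map built-ins
kept2 = list(map(lambda ln: ln.strip().upper(),
                 filter(lambda ln: "this" in ln, lines)))
print(kept2)  # same result
```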
Some of the other things UNIX tools did might involve taking a line and breaking it into chunks, such as at a comma or tab or space, and then keeping just the third and fifth and eighth fields, but in reverse order. We sometimes used commands like cut, or very brief AWK scripts, to do that. Again, this can be trivial to do in Python. Built in to character strings are functions that let you split a line into a list of fields on a separator, and perhaps rearrange and even rejoin them. Using the list comprehension method above, if you are expecting eight comma-separated regions:

    >>> line1 = "f1,f2,f3,f4,f5,f6,f7,f8"
    >>> line2 = "g1,g2,g3,g4,g5,g6,g7,g8"
    >>> lines = [line1, line2]
    >>> splitsville = [line.split(',') for line in lines]
    >>> splitsville
    [['f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8'], ['g1', 'g2', 'g3', 'g4', 'g5', 'g6', 'g7', 'g8']]
    >>> items8_5_3 = [(h8, h5, h3) for (h1,h2,h3,h4,h5,h6,h7,h8) in splitsville]
    >>> items8_5_3
    [('f8', 'f5', 'f3'), ('g8', 'g5', 'g3')]

Or if you want them back as character strings with an underscore between:

    >>> items8_5_3 = ['_'.join([h8, h5, h3]) for (h1,h2,h3,h4,h5,h6,h7,h8) in splitsville]
    >>> items8_5_3
    ['f8_f5_f3', 'g8_g5_g3']

The point is that we have oodles of little tools we can combine to solve bigger problems, sometimes in one big complicated mess and sometimes a simple step at a time. Not all of them can be chained the same way, but then we have somewhat more complex tools like generators and queues that can be chained together in ways even more flexible than UNIX pipelines. Each generator produces one result only when another needs it. And I assume it should be possible to write a series of methods on an object that extends a type like "list", maintaining an internal representation such as a list of strings and changing it in place, just as .sort() does.
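The generator-chaining idea above can be sketched in a few lines; the stage names keep_matching and to_upper are invented for illustration. Each stage is a lazy generator, so a line is only pulled through the chain when the final consumer asks for it, much as a UNIX pipeline processes data as it flows:

```python
def keep_matching(lines, text):
    """A grep-like stage: yield only lines containing text."""
    for line in lines:
        if text in line:
            yield line

def to_upper(lines):
    """A tr-like stage: yield each line uppercased."""
    for line in lines:
        yield line.upper()

raw = ["alpha one", "beta two", "alpha three"]

# Chain the stages; nothing runs until list() starts consuming.
result = list(to_upper(keep_matching(raw, "alpha")))
print(result)  # ['ALPHA ONE', 'ALPHA THREE']
```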
So applying a UNIX-style pipeline may be as simple as:

    mystdout = mystdin("LIST OF STRINGS TO INITIALIZE").method1(args).method2(args)....methodn(args)

The initializer sets the current "lines", and each method loops over the lines and replaces them with the output it wants. Perhaps the initializer or first method actually reads all lines from stdin. Perhaps the last method writes to stdout. All methods effectively run in sequence as they massage the data, though not actually in parallel. And you can even write a generic method that accepts any external function designed to take such a list of lines and return another to replace it.

My point to Mats is that the goal is to learn to divide and conquer a problem. Using small, well-defined methods that fit together is great. Many things in Python can be made to fit, and some need work. A dumb example: sorting a list in place returns None, not the object itself, so you cannot chain something like object.upper().sort() any further. You may be able to chain this:

    >>> list(reversed(sorted(lines))).pop()
    'f1,f2,f3,f4,f5,f6,f7,f8'

Why the odd syntax? Because the developers of Python, in their wisdom, chose not to make some methods work that way. Object.sort() and Object.reverse() change the internals and return nothing; they are not designed to be piped. If there were a method that performed a sort AND returned the object, or performed a reverse and returned the object, then we might see:

    lines.sort(show=True).reverse(show=True).pop()

or some other similar stratagem. Then we could write a fairly complex sequence in a pipelined mode.
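The chainable-object idea above can be sketched as a small list subclass; the class name Pipe and its methods keep, apply, and sorted are invented for illustration. Each method mutates the internal list of lines in place and then returns self, which is exactly the enhancement that would let .sort()-style methods participate in a pipeline:

```python
class Pipe(list):
    """A list of lines whose mutating methods return self for chaining."""

    def keep(self, cond):
        # Like a grep stage: keep only lines passing cond.
        self[:] = [line for line in self if cond(line)]
        return self

    def apply(self, func):
        # Like a sed/tr stage: replace each line with func(line).
        self[:] = [func(line) for line in self]
        return self

    def sorted(self):
        self.sort()   # list.sort() sorts in place and returns None...
        return self   # ...so we return self to allow further chaining

result = (Pipe(["b this", "a this", "c other"])
          .keep(lambda ln: "this" in ln)
          .apply(str.upper)
          .sorted())
print(result)  # ['A THIS', 'B THIS']
```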
-----Original Message-----
From: Tutor <tutor-bounces+avigross=verizon....@python.org> On Behalf Of Mats Wichmann
Sent: Tuesday, December 25, 2018 11:04 AM
To: tutor@python.org
Subject: Re: [Tutor] look back comprehensively

On 12/24/18 5:45 PM, Avi Gross wrote:
> As for the UNIX tools, one nice thing about them was using them in a
> pipeline where each step made some modification and often that merely
> allowed the next step to modify that. The solution did not depend on
> one tool doing everything.

I know we're wandering off topic here, but I miss the days when this philosophy was more prevalent - "do one thing well" and be prepared to pass your results on in a way that a different tool could potentially consume, doing its one thing well, and so on if needed. Of course equivalents of those old UNIX tools are still with us, mostly thanks to the GNU umbrella of projects, but so many current tools have grown so many capabilities they no longer can interact with other tools in any sane way. "pipes and filters" seems destined to be consigned to the dustbin of tech history.

I'll shut up now...

_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor