On 27May2014 15:27, Degreat Yartey <yarteydegre...@gmail.com> wrote:
I am studying python on my own (i.e. i am between the beginner and
intermediate level) and i haven't met any difficulty until i reached the
topic 'Generators and Iterators'.
I need an explanation so simple as using the expression 'print ()', in this
case 'yield'.
Python 2.6 here!
Thank you.

Generators are functions that do a bit of work and then yield a value, then a bit more and so on. This means that you "call" them once. What you get back is an iterator, not the normal function return value.

Whenever you use the iterator, the generator function runs until it hits a "yield" statement, and the value in the yield statement is what you get for that iteration. Next time you iterate, the function resumes where it left off and runs a bit more, until it yields again or returns (the end of the function, which ends the iteration).

So the function doesn't even run until you ask for a value, and then it only runs long enough to find the next value.
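A small sketch of that behaviour, untested on 2.6 but using only things that exist there (the "next" builtin arrived in 2.6). The names "count_to" and "log" are made up for illustration; the log list just records when the function body actually runs:

```python
log = []   # records what the generator body does, and when

def count_to(limit):
    ''' Hypothetical generator: yield 1, 2, ..., limit, noting each step in `log`.
    '''
    log.append('start')
    for n in range(1, limit + 1):
        log.append('yield %d' % n)
        yield n
    log.append('end')

it = count_to(2)               # "call" it once: you get an iterator back
log_after_call = list(log)     # [] : none of the function body has run yet

first = next(it)               # runs up to the first yield
log_after_first = list(log)    # ['start', 'yield 1']

second = next(it)              # resumes after the first yield, runs to the second
rest = list(it)                # resumes, falls off the end: iteration is over
log_at_end = list(log)         # ['start', 'yield 1', 'yield 2', 'end']
```

Note that "log_after_call" is empty: calling "count_to(2)" only creates the iterator. The body runs in pieces, one piece per value requested.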

Example (all code illustrative only, untested):

Suppose you need to process every second line of a file.

You might write it directly like this:

  def munge_lines(fp):
    ''' Do stuff with every second line of the already-open file `fp`.
    '''
    lineno = 0
    for line in fp:
      lineno += 1
      if lineno % 2 == 0:
        print lineno, line,

That should read lines from the file and print every second one with the line number.

Now suppose you want something more complex than "every second line", especially something that requires keeping track of some state. In the example above you only need the line number, and using it still consumes 2 of the 3 lines in the loop body.

A more common example might be "lines between two markers".

The more of that you embed in the "munge_lines" function, the more it will get in the way of seeing what the function actually does.

So a reasonable thing might be to write a function that gets the requested lines:

  def wanted_lines(fp):
    wanted = []
    between = False
    for line in fp:
      if between:
        if 'end_marker' in line:
          between = False
        else:
          wanted.append(line)
      elif 'start_marker' in line:
        between = True
    return wanted

This reads the whole file and returns a list of the wanted lines, and the loop in "munge_lines" might then look like this:

  for line in wanted_lines(fp):
    print line,

However:

  - that reads the whole file before returning anything

  - it has to keep all the wanted lines in the list "wanted"

Slow in response, heavy in memory cost, and unworkable if "fp" actually doesn't end (eg reading from a terminal, or a pipeline, or...)

What you'd really like is to get each line as needed.

We can rewrite "wanted_lines" as a generator:

  def wanted_lines(fp):
    between = False
    for line in fp:
      if between:
        if 'end_marker' in line:
          between = False
        else:
          yield line
      elif 'start_marker' in line:
        between = True

All we've done is use "yield" instead of the "append", and remove the "wanted" list and the return statement. The calling code is the same.
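As a rough check of the "input that never ends" case, here is a sketch where the generator is fed from a source that never finishes. "endless_input" is a made-up stand-in for a terminal or pipeline, and the marker strings are assumed to be 'start_marker' and 'end_marker'. "itertools.islice" is used only to stop asking after three values:

```python
import itertools

def wanted_lines(fp):
    ''' Generator: yield the lines found between the two markers. '''
    between = False
    for line in fp:
        if between:
            if 'end_marker' in line:
                between = False
            else:
                yield line
        elif 'start_marker' in line:
            between = True

def endless_input():
    ''' Hypothetical stand-in for a terminal or pipe: it never stops. '''
    n = 0
    while True:
        n += 1
        yield 'start_marker\n'
        yield 'data %d\n' % n
        yield 'end_marker\n'

# Ask for just the first three wanted lines, then stop asking.
first_three = list(itertools.islice(wanted_lines(endless_input()), 3))
# first_three == ['data 1\n', 'data 2\n', 'data 3\n']
```

The list-building version would never return on this input, because it only hands anything back after the for loop finishes.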

To see the difference, put a "print" in "wanted_lines" as the first line of the for loop. With the list version you will see all the prints run before you get the list back. With the generator you will see the print run just before each value you get back.
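The same experiment can be sketched without printing, by appending to a list instead, which makes the ordering easy to inspect afterwards. The "_list" and "_gen" suffixes and the "trace" parameter are made up here so both versions can live side by side, and the marker strings are assumed to be 'start_marker' and 'end_marker':

```python
def wanted_lines_list(fp, trace):
    ''' List version: read everything, then return the wanted lines. '''
    wanted = []
    between = False
    for line in fp:
        trace.append('read ' + line)    # stands in for the "print"
        if between:
            if 'end_marker' in line:
                between = False
            else:
                wanted.append(line)
        elif 'start_marker' in line:
            between = True
    return wanted

def wanted_lines_gen(fp, trace):
    ''' Generator version: hand back each wanted line as it is found. '''
    between = False
    for line in fp:
        trace.append('read ' + line)    # stands in for the "print"
        if between:
            if 'end_marker' in line:
                between = False
            else:
                yield line
        elif 'start_marker' in line:
            between = True

lines = ['start_marker', 'a', 'b', 'end_marker', 'c']

list_trace = []
for line in wanted_lines_list(lines, list_trace):
    list_trace.append('got ' + line)
# all the reads come first:
# read start_marker, read a, read b, read end_marker, read c, got a, got b

gen_trace = []
for line in wanted_lines_gen(lines, gen_trace):
    gen_trace.append('got ' + line)
# reads and results interleave:
# read start_marker, read a, got a, read b, got b, read end_marker, read c
```

The calling loop is identical in both cases; only the order in which the work happens changes.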

Cheers,
Cameron Simpson <c...@zip.com.au>