Sorry it took so long to respond, just getting back from the holidays. You
all have given me much to think about. I've read all the messages through
once; now I need to go through them again and try to apply the ideas. I'll
be posting other questions as I run into problems. BTW, Danny, best
explanation of generators I've heard, well done and thank you.

regards, Richard

On Thu, Dec 24, 2015 at 4:54 PM, Danny Yoo <d...@hashcollision.org> wrote:

> > I think what I need to do would be analogous to (pardon if I'm using the
> > wrong terminology; at this point in the discussion I am officially out of
> > my depth) sending the input stream to a buffer until the ETX for that
> > message comes in, then shooting the buffer contents to the parser while
> > accepting the next STX + message fragment into the buffer, or something
> > analogous.
>
> Yes, I agree.  It sounds like you want one process to read the socket
> and collect chunks of bytes delimited by the STX/ETX markers.  It can
> then send those chunks to the XML parser.
>
>
> We can imagine one process that reads the socket and spits out a list
> of byte chunks:
>
>     chunks = readDelimitedChunks(socket)
>
> and another process that parses those chunks and does something with them:
>
>     for chunk in chunks:
>         ....
>
>
> It would be nice if we could organize the program like this.  But one
> problem is that chunks might not be finite!  The socket might keep on
> returning bytes.  If it keeps returning bytes, we can't possibly
> return a finite list of the chunked bytes.
>
>
> What we really want is something like:
>
>     chunkStream = readDelimitedChunks(socket)
>     for chunk in chunkStream:
>         ....
>
> where chunkStream is itself like a socket: it should be something that
> we can repeatedly read from as if it were potentially infinite.
>
>
> We can actually do this, and it isn't too bad.  There's a mechanism in
> Python called a generator that allows us to write function-like things
> that consume streams of input and produce streams of output.  Here's a
> brief introduction to them.
>
> For example, here's a generator that knows how to produce an infinite
> stream of numbers:
>
> ##############
> def nums():
>     n = 0
>     while True:
>         yield n
>         n += 1
> ##############
>
> What distinguishes a generator from a regular function?  The use of
> "yield".  A "yield" is like a return, but rather than completely
> escaping the function with the return value, this generator will
> remember what it was doing at that time.  Why?  Because it can
> *resume* itself when we try to get another value out of the generator.
>
> Let's try it out:
>
> #####################
>
> >>> numStream = nums()
> >>> next(numStream)
> 0
> >>> next(numStream)
> 1
> >>> next(numStream)
> 2
> >>> next(numStream)
> 3
> >>> next(numStream)
> 4
> #####################
>
> Every next() we call on a generator resumes it from where it left
> off, until it reaches its next "yield".  That's how we get this
> generator to produce an infinite sequence of things.
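> As an aside: calling next() by hand gets tedious.  The standard
> library's itertools.islice can take a finite slice of an infinite
> generator, which is handy for experimenting:

```python
import itertools

def nums():
    n = 0
    while True:
        yield n
        n += 1

# islice stops after the requested number of items, so it is safe
# to use on an infinite generator.
first_five = list(itertools.islice(nums(), 5))
print(first_five)  # prints [0, 1, 2, 3, 4]
```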
>
>
> That's how we produce infinite sequences.  And we can write another
> generator that knows how to take a stream of numbers, and square each
> one.
>
> ########################
> def squaring(stream):
>     for n in stream:
>         yield n * n
> ########################
>
>
> Let's try it.
>
>
> ########################
>
> >>> numStream = nums()
> >>> squaredNums = squaring(numStream)
> >>> next(squaredNums)
> 0
> >>> next(squaredNums)
> 1
> >>> next(squaredNums)
> 4
> >>> next(squaredNums)
> 9
> >>> next(squaredNums)
> 16
> ########################
>
>
> If you have experience with other programming languages, you may have
> heard of the term "co-routine".  What we're doing with this should be
> reminiscent of coroutine-style programming.  We have one generator
> feeding input into the other, with program control bouncing back and
> forth between the generators as necessary.
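> To make that concrete: because a generator both consumes and produces
> a stream, stages chain freely.  Reusing nums and squaring from above
> (just a sketch):

```python
def nums():
    n = 0
    while True:
        yield n
        n += 1

def squaring(stream):
    for n in stream:
        yield n * n

# Chain two squaring stages: each next() pulls one value all the way
# through the pipeline, bouncing control between the generators.
pipeline = squaring(squaring(nums()))   # yields fourth powers
print(next(pipeline))  # prints 0
print(next(pipeline))  # prints 1
print(next(pipeline))  # prints 16
```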
>
>
> So that's a basic idea of generators.  It lets us write processes that
> can deal with and produce streams of data.  In the context of sockets,
> this is particularly helpful, because sockets can be considered a
> stream of bytes.
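> To tie this back to your STX/ETX problem, here's one possible sketch
> of that readDelimitedChunks idea.  Treat it as a starting point, not a
> finished implementation: the recv callable, the framing bytes, and the
> error handling are all assumptions you'd adapt to your real socket:

```python
STX, ETX = b'\x02', b'\x03'

def read_delimited_chunks(recv):
    """Yield the payload of each STX...ETX-framed message.

    recv is any no-argument callable returning the next block of
    bytes (e.g. a socket.recv bound to a size); an empty result
    means end of input.  Incomplete trailing frames are dropped.
    """
    buffer = b''
    while True:
        data = recv()
        if not data:
            break
        buffer += data
        # Extract every complete STX...ETX frame currently buffered.
        while True:
            start = buffer.find(STX)
            if start == -1:
                break
            end = buffer.find(ETX, start + 1)
            if end == -1:
                break
            yield buffer[start + 1:end]
            buffer = buffer[end + 1:]

# Example: feed chunks from a list instead of a socket.
pieces = iter([b'\x02hel', b'lo\x03\x02wo', b'rld\x03', b''])
print(list(read_delimited_chunks(lambda: next(pieces))))
# prints [b'hello', b'world']
```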
>
>
> Here's another toy example that's closer to the problem you're trying
> to solve.  Let's say that we're working on a program to alphabetize
> the words of a sentence.  Very useless, of course.  :P  We might pass
> it in the input:
>
>     this
>     is
>     a
>     test
>     of
>     the
>     emergency
>     broadcast
>     system
>
> and expect to get back the following sentence:
>
>      hist
>      is
>      a
>      estt
>      fo
>      eht
>      ceeegmnry
>      aabcdorst
>      emssty
>
> We can imagine one process doing chunking, going from a sequence of
> characters to a sequence of words:
>
> ###########################################
> def extract_words(seq):
>     """Yield the words in a sequence of characters."""
>     buffer = []
>     for ch in seq:
>         if ch.isalpha():
>             buffer.append(ch)
>         elif buffer:
>             yield ''.join(buffer)
>             del buffer[:]
>     # If we hit the end of the input, we still might
>     # need to yield one more result.
>     if buffer:
>         yield ''.join(buffer)
> ###########################################
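> A quick sanity check of extract_words on a plain string (a string is
> itself a sequence of characters, so it works as input here):

```python
def extract_words(seq):
    """Yield the words in a sequence of characters."""
    buffer = []
    for ch in seq:
        if ch.isalpha():
            buffer.append(ch)
        elif buffer:
            yield ''.join(buffer)
            del buffer[:]
    # One last word may still be buffered at end of input.
    if buffer:
        yield ''.join(buffer)

print(list(extract_words("this is a test")))
# prints ['this', 'is', 'a', 'test']
```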
>
>
> and a function that transforms words to their munged counterpart:
>
> #########################
> def transform(word):
>     """Munges a word into its alphabetized form."""
>     chars = list(word)
>     chars.sort()
>     return ''.join(chars)
> #########################
>
> This forms the major components of a program that can do the munging
> on a file... or a socket!
>
>
> Here's the complete example:
>
>
> #############################################
> import sys
>
> def extract_words(seq):
>     """Yield the words in a sequence of characters."""
>     buffer = []
>     for ch in seq:
>         if ch.isalpha():
>             buffer.append(ch)
>         elif buffer:
>             yield ''.join(buffer)
>             del buffer[:]
>     # If we hit the end of the input, we still might
>     # need to yield one more result.
>     if buffer:
>         yield ''.join(buffer)
>
> def transform(word):
>     """Munges a word into its alphabetized form."""
>     chars = list(word)
>     chars.sort()
>     return ''.join(chars)
>
>
> def as_byte_seq(f):
>     """Return the bytes of the file-like object f as a
>     sequence."""
>     while True:
>         ch = f.read(1)
>         if not ch: break
>         yield ch
>
>
> if __name__ == '__main__':
>     for word in extract_words(as_byte_seq(sys.stdin)):
>         print(transform(word))
> ############################################
>
>
>
> If you have questions, please feel free to ask.  Good luck!
>



-- 

All internal models of the world are approximate. ~ Sebastian Thrun
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
