On Sep 16, 2:48 pm, "Karl Kobata" <[EMAIL PROTECTED]> wrote:
> Hi Fredrik,
>
> This is exactly what I need.  Thank you.
> I would like to do one additional thing.  I am not using the tokenizer to
> parse Python code; it just happens to work very well for my application.
> However, I would like either or both of the following variants:
> 1) add 2 other characters as comment designators
> 2) write a module that can readline, modify the line as required, and
> finally be used as the argument to the tokenizer:
>
> def modifyLine( fileHandle ):
>     # readline and modify this string if required
>     ...
>
> for token in tokenize.generate_tokens( modifyLine( myFileHandle ) ):
>     print token
>
> Anxiously looking forward to your thoughts.
> karl
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Fredrik Lundh
> Sent: Monday, September 15, 2008 2:04 PM
> To: [EMAIL PROTECTED]
> Subject: Re: ka-ping yee tokenizer.py
>
> Karl Kobata wrote:
>
> > I have enjoyed using ka-ping yee's tokenizer.py.  I would like to
> > replace the readline parameter input with my own and pass a list of
> > strings to the tokenizer.  I understand it must be a callable object and
> > iterable, but it is obvious from the errors I am getting that this is
> > not the only requirement.
>
> not sure I can decipher your detailed requirements, but to use Python's
> standard "tokenize" module (written by ping) on a list, you can simply
> do as follows:
>
>     import tokenize
>
>     program = [ ... program given as list ... ]
>
>     for token in tokenize.generate_tokens(iter(program).next):
>         print token
>
> another approach is to turn the list back into a string, and wrap that
> in a StringIO object:
>
>     import tokenize
>     import StringIO
>
>     program = [ ... program given as list ... ]
>
>     program_buffer = StringIO.StringIO("".join(program))
>
>     for token in tokenize.generate_tokens(program_buffer.readline):
>         print token
>
> </F>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
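Karl's second variant (a wrapper that modifies each line before the tokenizer sees it) can be sketched directly against Fredrik's readline-based examples. This is a hypothetical illustration written for Python 3, where `tokenize.generate_tokens` still takes a readline callable; `modified_readline` is an assumed helper name, and upper-casing stands in for whatever per-line transformation is actually needed:

```python
import io
import tokenize

def modified_readline(readline, modify):
    """Wrap a readline callable so every line is transformed
    before the tokenizer sees it."""
    def inner():
        line = readline()
        # leave the EOF sentinel ('') untouched so tokenize can stop cleanly
        return modify(line) if line else line
    return inner

program = io.StringIO("x = 1\ny = 2\n")

# Upper-case every line before tokenizing (a stand-in transformation).
tokens = list(tokenize.generate_tokens(
    modified_readline(program.readline, str.upper)))

names = [tok.string for tok in tokens if tok.type == tokenize.NAME]
print(names)  # ['X', 'Y']
```

The wrapper only intercepts the readline callable, so it composes with either of the quoted approaches (a list iterator's next method or a StringIO's readline).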
This is an interesting construction:

>>> a = [ 'a', 'b', 'c' ]
>>> def moditer( mod, nextfun ):
...     while 1:
...         yield mod( nextfun( ) )
...
>>> list( moditer( ord, iter( a ).next ) )
[97, 98, 99]

Here's my point:

>>> a = [ 'print a', 'print b', 'print c' ]
>>> tokenize.generate_tokens( iter( a ).next )
<generator object at 0x009FF440>
>>> tokenize.generate_tokens( moditer( lambda s: s+ '#', iter( a ).next ).next )

It adds a '#' to the end of every line, then tokenizes.
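The moditer construction above is Python 2 (`iter(a).next`, and a generator that lets the source's StopIteration leak through). A sketch of the same trick for Python 3 follows; note that since PEP 479 a StopIteration raised inside a generator becomes a RuntimeError, so the wrapper must catch it and return instead. The comment text here is an arbitrary stand-in:

```python
import tokenize

def moditer(mod, nextfun):
    # Yield mod(line) for each line produced by nextfun; on Python 3,
    # StopIteration from nextfun must be caught explicitly (PEP 479).
    while True:
        try:
            line = nextfun()
        except StopIteration:
            return
        yield mod(line)

a = ['print(1)\n', 'print(2)\n']

# Append a comment marker to each line (before its newline), then tokenize.
tokens = list(tokenize.generate_tokens(
    moditer(lambda s: s.rstrip('\n') + ' #tail\n', iter(a).__next__).__next__))

comments = [t.string for t in tokens if t.type == tokenize.COMMENT]
print(comments)  # ['#tail', '#tail']
```

Appending before the newline matters: tokenize treats each string returned by the readline callable as one logical line, so tacking `'#'` on after the `'\n'` would not land the comment on the line it was meant to annotate.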