qwweeeit wrote:
> Thank you for your suggestion, but it is too complicated for me...
> I decided to proceed in steps:
> 1. Take away all commented lines
> 2. Rebuild the multi-lines as single lines

Ummm, OK. All I can say is: did you try this? If not, save it as a
module, then import it into the interpreter and try it.
This is a dead simple module to do *exactly* what you asked for :)
Like I said, I have done this before, so I will restate: *I HAVE FAILED
AT THIS BEFORE, MANY TIMES*. Now I have a solution.
It writes to stdout by default but can write to a file-like object if
you give it one. It already handles continued lines, so there is no need
to futz around with some other solution.

Here is an example:

Py> filein = """
... class Stripper:
...     '''python comment and whitespace stripper
...     '''
...     def __init__(self, raw):
...         ''' Store the source text & set some flags.
...         '''
...         self.raw = raw
...
...     def format(self, out=sys.stdout, comments=0,
...                spaces=1, untabify=1,eol='unix'):
...         '''Parse and send the colored source.'''
...         # Store line offsets in self.lines
...         self.lines = [0, 0]
...         pos = 0
...         # Strips the first blank line if 1
...         self.lasttoken = 1
...         self.temp = StringIO.StringIO()
...         self.spaces = spaces
...         self.comments = comments
...
...         if untabify:
...             self.raw = self.raw.expandtabs()
...         self.raw = self.raw.rstrip()+' '
...         self.out = out
... """
Py> replacer = ReplaceParser(filein, out=sys.stdout)
Py> replacer.format()
class Stripper:
    s000001
    def __init__(self, raw):
        s000002
        self.raw = raw
    def format(self, out=sys.stdout, comments=0, spaces=1, untabify=1,eol=s000003):
        s000004
        # Store line offsets in self.lines
        self.lines = [0, 0]
        pos = 0
        # Strips the first blank line if 1
        self.lasttoken = 1
        self.temp = StringIO.StringIO()
        self.spaces = spaces
        self.comments = comments
        if untabify:
            self.raw = self.raw.expandtabs()
        self.raw = self.raw.rstrip()+s000005
        self.out = out
Py> replacer.StringMap
{'s000004': "'''Parse and send the colored source.'''",
 's000005': "' '",
 's000001': "'''python comment and whitespace stripper :)\n '''",
 's000002': "''' Store the source text & set some flags.\n '''",
 's000003': "'unix'"}

Each string literal is replaced by a generated name, and StringMap lets
you look up the original text for each name.

You can also strip out comments with a few lines.
It can easily get single comments or doubles.
Add this in your __call__ function:

[snip]
            self.pos = newpos
            return
        # kills comments
        if (toktype == tokenize.COMMENT):
            return
        if (toktype == token.STRING):
            sname = self.StringName.next()
[snip]

If you insist on writing something yourself, go ahead.
Let me know what your solution is, I am curious.
M.E.Farmer
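For anyone who wants to try the idea without the full module, here is a minimal sketch of the same technique: let the tokenize module find the string literals and comments, then swap or drop them. This is not the ReplaceParser module shown above. The function name replace_strings and its strip_comments flag are made up for illustration, the placeholder format simply mirrors the s000001 names in the example, and the code assumes Python 3.

```python
import io
import tokenize


def replace_strings(source, strip_comments=False):
    """Swap each string literal for a placeholder name.

    Hypothetical helper, not part of the module discussed in the post.
    Returns (new_source, string_map), where string_map maps each
    placeholder back to the original literal text.
    """
    # Start offset of every line, so a (row, col) token position can be
    # turned into an index into the source string. Iterating a StringIO
    # splits lines the same way the tokenizer's readline does.
    offsets = [0]
    for line in io.StringIO(source):
        offsets.append(offsets[-1] + len(line))

    def index(pos):
        row, col = pos  # tokenize rows are 1-based, columns 0-based
        return offsets[row - 1] + col

    string_map = {}
    edits = []  # (start, end, replacement) spans in the source
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING:
            # Note: on Python 3.12+ f-strings are tokenized as separate
            # FSTRING_* tokens and are left untouched by this sketch.
            name = "s%06d" % (len(string_map) + 1)
            string_map[name] = tok.string
            edits.append((index(tok.start), index(tok.end), name))
        elif tok.type == tokenize.COMMENT and strip_comments:
            edits.append((index(tok.start), index(tok.end), ""))

    # Apply the edits back to front so earlier offsets stay valid.
    for start, end, text in reversed(edits):
        source = source[:start] + text + source[end:]
    return source, string_map
```

Calling replace_strings(text, strip_comments=True) returns the rewritten source plus the map of placeholders. Unlike the module above, this sketch does not join continued lines, and any whitespace that preceded a removed comment is left in place.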