On Apr 13, 3:59 pm, "Flyzone" <[EMAIL PROTECTED]> wrote:
> Hi,
> i have a problem with the split function and regexp.
> I have a file that i want to split using the date as token.
> Here a sample:
> -----
> Mon Apr 9 22:30:18 2007
> text
> text
> Mon Apr 9 22:31:10 2007
> text
> text
> ----
>
> I'm trying to put all the lines in a one string and then to separate it
> (could be better to not delete the \n if possible...)
> while 1:
>     line = ftoparse.readline()
>     if not line: break
>     if line[-1]=='\n': line=line[:-1]
>     file_str += line
> matchobj=re.compile('[A-Z][a-z][a-z][ ][A-Z][a-z][a-z][ ][0-9| ][0-9][ ][0-9][0-9][:]')
> matchobj=matchobj.split(file_str)
> print matchobj
>
> i have tried also
> matchobj=re.split(r"^[A-Z][a-z][a-z][ ][A-Z][a-z][a-z][ ][0-9| ][0-9][ ][0-9][0-9][:]",file_str)
> and reading all with one:
> file_str=ftoparse.readlines()
> but the split doesn't work...where i am wronging?

You're trying to match the date part, right? If a regex is what you want, here's one example:

>>> import re
>>> data = open("file").read()
>>> pat = re.compile(r"[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}")
>>> print(pat.findall(data))
['Mon Apr 9 22:30:18 2007', 'Mon Apr 9 22:31:10 2007']

-- 
http://mail.python.org/mailman/listinfo/python-list
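
If the goal is to split the file on the date lines rather than just list them, something along these lines might do. It is only a sketch: split_by_date and the sample text are made up for illustration, and it assumes each date sits alone on its own line in the format shown in the sample. Two things to keep in mind about the attempts in the question: "^" only matches at the very start of the string unless re.MULTILINE is given, and readlines() returns a list, which re.split cannot take. Reading the whole file with read() also keeps the "\n" characters.

```python
import re

# One date header per line, e.g. "Mon Apr  9 22:30:18 2007".
# " +" tolerates a space-padded day. The parentheses make re.split
# return the matched date along with the text between dates.
DATE_LINE = re.compile(
    r"^([A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})[ \t]*\n",
    re.MULTILINE,  # let "^" match at the start of every line
)

def split_by_date(text):
    """Return a list of (date, body) pairs; each body keeps its newlines."""
    parts = DATE_LINE.split(text)
    # parts looks like [before_first_date, date1, body1, date2, body2, ...]
    dates, bodies = parts[1::2], parts[2::2]
    return list(zip(dates, bodies))

# Hypothetical sample; in practice use text = open("file").read()
sample = (
    "Mon Apr  9 22:30:18 2007\n"
    "text a\n"
    "text b\n"
    "Mon Apr  9 22:31:10 2007\n"
    "text c\n"
)

records = split_by_date(sample)
for date, body in records:
    print("%r -> %r" % (date, body))
```

Anything before the first date line ends up in parts[0] and is dropped here; keep it if the file can start with text that has no date.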