Hold the phone! I have no idea why it worked, and would love an explanation, but I changed my previous test script by eliminating

    for tag in ("icdm"):

and changing

    if tag in line:

to

    if 'icdm' in line:

and it works perfectly! It only iterates over the file once, and the else executes, so both types of lines format correctly, except for the multiple identifiers in the non-icdm lines. I could still use some help with that bit, please.

regards, Richard

On Wed, Jun 3, 2015 at 4:13 PM, richard kappler <richkapp...@gmail.com> wrote:

> I was trying to keep it simple, you'd think by now I'd know better. My
> fault and my apology.
>
> It's definitely not all dates and times; the data and character types
> vary. This is the output from my log parser script which you helped on the
> other day. There are essentially two types of line:
>
> Tue Jun 2 10:22:42 2015<usertag1 name="SE">SE201506012200310389PS01CT1407166S0011.40009.00007.6IN 000000000018.1LB000258]C10259612019466862270088094]L0223PDF</usertag1>
> Tue Jun 2 10:22:43 2015<usertag1 name="SE">SE0389icdim01307755C0038.20033.20012.0IN1000000000 0032]C10259612804038813568089577</usertag1>
>
> I have to do several things:
> The first type can be of variable length. Everything after the ] is an
> identifier that I have to separate; some lines have one, some have more
> than one, of variable length, always delimited by a ].
> The second type (line 2) doesn't have the internal datetime stamp, so I
> just need to add 14 x's to fill in the space where that datetime stamp
> would be.
>
> And finally, I have to break these apart and put a descriptor with each.
>
> While I was waiting for a response to this, I put together a script to
> start figuring things out (what could possibly go wrong?!?!?! :-) )
>
> and I can't post the exact script, but the following is the guts of it:
>
> f1 = open('unformatted.log', 'r')
> f2 = open('formatted.log', 'a')
>
> for line in f1:
>     for tag in ("icdm"):
>         if tag in line:
>             newline = 'log datestamp:' + line[0:24] # + and so on to format the lines with icdm in them including adding 14 x's for the missing timestamp
>             f2.write(newline) #write the formatted output to the new log
>         else:
>             newline = 'log datestamp:' + line[0:24] # + and so on to format the non-icdm lines
>             f2.write(newline)
>
> The problems are:
> 1. For some reason this iterates over the 24 line file 5 times, and it
> writes the 14 x's to every line, so my non-icdm code (the else:) isn't
> getting executed. I'm missing something basic and obvious but have no idea
> what.
> 2. I still don't know how to handle the differences at the end of the
> non-icdm lines (potentially more than one identifier, ] delimited as
> described above).
>
> regards, Richard
>
> On Wed, Jun 3, 2015 at 3:53 PM, Alan Gauld <alan.ga...@btinternet.com>
> wrote:
>
>> On 03/06/15 20:10, richard kappler wrote:
>>
>>> for formatting a string and adding descriptors:
>>>
>>> test = 'datetimepart1part2part3the_rest'
>>
>> If this is really about parsing dates and times have
>> you looked at the datetime module and its parsing/formatting
>> functions (ie strptime/strftime)?
>>
>>> Can I stop using position numbers and start looking for specific
>>> characters (the delimiter) and proceed to the end (which is always
>>> a constant string btw).
>>
>> The general answer is probably to look at regular expressions.
>> But they get messy fast so usually I'd suggest trying regular
>> string searches/replaces and splits first.
>>
>> But if your pattern is genuinely complex and variable then
>> regex may be the right solution.
>>
>> But if it's dates check the strptime() functions first.
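The "splits first" suggestion above seems to fit the ]-delimited identifiers well. What follows is only a sketch under some assumptions: that everything before the first ] is the fixed part of the record, that the closing </usertag1> tag should be dropped before splitting, and that the input is the text after the opening tag. The name split_identifiers is made up for illustration and is not from the original script.

```python
def split_identifiers(payload, closing_tag="</usertag1>"):
    """Return (body, identifiers) for one record payload.

    payload is assumed to be the text after the opening <usertag1 ...> tag.
    """
    payload = payload.strip()
    # Drop the closing tag so it does not end up glued to the last identifier.
    if payload.endswith(closing_tag):
        payload = payload[:-len(closing_tag)]
    # split(']') gives the text before the first ']' followed by one item
    # per identifier, however many there are (possibly none).
    parts = payload.split("]")
    return parts[0], parts[1:]


sample = ("SE201506012200310389PS01CT1407166S0011.40009.00007.6IN "
          "000000000018.1LB000258]C10259612019466862270088094]L0223PDF"
          "</usertag1>")
body, identifiers = split_identifiers(sample)
print(identifiers)  # ['C10259612019466862270088094', 'L0223PDF']
```

Because split returns a list, a line with one identifier and a line with three go through the same code path, and a descriptor can be attached to each item in a loop.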
>>
>> --
>> Alan G
>> Author of the Learn to Program web site
>> http://www.alan-g.me.uk/
>> http://www.amazon.com/author/alan_gauld
>> Follow my photo-blog on Flickr at:
>> http://www.flickr.com/photos/alangauldphotos
>>
>> _______________________________________________
>> Tutor maillist - Tutor@python.org
>> To unsubscribe or change subscription options:
>> https://mail.python.org/mailman/listinfo/tutor
>
> --
>
> Windows assumes you are an idiot…Linux demands proof.

--

Windows assumes you are an idiot…Linux demands proof.
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
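
On the "why did it work" question, a likely explanation is that ("icdm") is not a tuple. Parentheses alone do not make a tuple, the comma does, so the for loop walks the string one character at a time and the if/else body runs once per character for every line. Each test then asks whether a single letter occurs in the line, which is a much weaker check than looking for the whole tag. The sketch below uses made-up lines and a hypothetical helper, count_writes, purely to show the effect.

```python
not_a_tuple = ("icdm")    # just the string 'icdm' wrapped in parentheses
one_tuple = ("icdm",)     # the trailing comma is what makes a 1-tuple

print(type(not_a_tuple).__name__)   # str
print(type(one_tuple).__name__)     # tuple

# Iterating a string yields its characters, not the string itself.
print([tag for tag in not_a_tuple])  # ['i', 'c', 'd', 'm']
print([tag for tag in one_tuple])    # ['icdm']


def count_writes(lines, tags):
    """Count how often the if/else body runs (it writes once each time)."""
    writes = 0
    for line in lines:
        for tag in tags:
            writes += 1
    return writes


lines = ["line one\n", "line two\n", "line three\n"]  # stand-in data
writes_with_string = count_writes(lines, ("icdm"))
writes_with_tuple = count_writes(lines, ("icdm",))
print(writes_with_string)  # 12, four writes per line
print(writes_with_tuple)   # 3, one write per line
```

A four-letter tag gives four passes per line. The five passes reported would match a five-letter tag such as the "icdim" that appears in the sample data, though that is a guess about the real script. Dropping the inner loop and testing 'icdm' in line directly, as in the fixed version, avoids the problem altogether.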