Re: Fwd: Python new user question - file writeline error
On Thu, 08 Feb 2007 14:20:57 -0300, Shawn Milo <[EMAIL PROTECTED]> wrote:

> On 8 Feb 2007 09:05:51 -0800, Gabriel Genellina <[EMAIL PROTECTED]>
> wrote:
>> On 8 Feb, 12:41, "Shawn Milo" <[EMAIL PROTECTED]> wrote:
>>
>> > I have come up with something that's working fine. However, I'm fairly
>> > new to Python, so I'd really appreciate any suggestions on how this
>> > can be made more Pythonic.
>>
>> A few comments:
>>
>> You don't need the formatDatePart function; delete it, and replace
>>     newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
>> with
>>     newDate = ",%04.4d-%02.2d-%02.2d," % (yearNum,monthNum,dayNum)
>>
>> and before:
>>     dayNum, monthNum, yearNum = [int(num) for num in
>>         someDate[1:-1].split('/')]
>>
>> And this: outfile.writelines(line)
>> should be: outfile.write(line)
>> (writelines works almost by accident here).
>>
>> You forget again to use () to call the close methods:
>>     infile.close()
>>     outfile.close()
>>
>> I don't like the final replace, but for a script like this I think
>> it's OK.
>>
>> --
>> Gabriel Genellina
>>
>> --
>> http://mail.python.org/mailman/listinfo/python-list
>
> Gabriel,
>
> Thanks for the comments! The new version is below. I thought it made a
> little more sense to format the newDate = ... line the way I have it
> below, although I did incorporate your suggestions.

Looks pretty good to me! Just one little thing I would change: the variables monthNum, dayNum etc. carry a suffix that suggests they are numbers, but they are strings instead. So I would move the int(...) a few lines above, to where the variables are defined. But that's just a cosmetic thing and a matter of taste.

> Also, the
> formatting options you provided seemed to specify not only string
> padding, but also decimal places, so I changed it. Please let me know
> if there is some other meaning behind the way you did it.

No, it has no meaning, at least for this range of values.

> As for not liking the replace line, what would you suggest instead?

You have already scanned the line to find the matching fragment; the match object knows exactly where it begins and ends; so one could replace it with the reformatted value without searching again, which takes some more time, at least in principle. But this makes the code a bit more complex, and it would only make sense if you were to process millions of lines, and even then, the execution might be I/O-bound so you would gain nothing in the end. That's why I think it's OK as it is now.

--
Gabriel Genellina
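The idea of reusing the match instead of searching the line a second time could look roughly like the sketch below. It is written for current Python 3 rather than the Python 2 used in the thread, and it assumes re.sub with a function as the replacement; the names MONTHS, DATE_RE and convert_dates are made up for the example and do not come from the original script.

```python
import re

# Same month table as in the thread, under a hypothetical name.
MONTHS = {'JAN': 1, 'FEB': 2, 'MAR': 3, 'APR': 4, 'MAY': 5, 'JUN': 6,
          'JUL': 7, 'AUG': 8, 'SEP': 9, 'OCT': 10, 'NOV': 11, 'DEC': 12}

# The thread's pattern, with groups added so the parts can be read off the match.
DATE_RE = re.compile(r",(\d{2})/([A-Z]{3})/(\d{4}),")


def _reformat(match):
    """Build the replacement text directly from the match object."""
    day, month_name, year = match.groups()
    return ",%04d-%02d-%02d," % (int(year), MONTHS[month_name], int(day))


def convert_dates(line):
    """Rewrite every ,DD/MON/YYYY, field in one pass over the line."""
    return DATE_RE.sub(_reformat, line)


print(convert_dates("JJB,05/MAR/1950,M,20/NOV/2006,08:50\n"))
```

Because the replacement function receives each match as it is found, there is no separate line.replace() search. Whether that is measurably faster on a file of this size is not something the thread establishes.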
Re: Fwd: Python new user question - file writeline error
On Feb 8, 3:26 pm, Bruno Desthuilliers <[EMAIL PROTECTED]> wrote: > Shawn Milo a écrit : > > > > > To the list: > > > I have come up with something that's working fine. However, I'm fairly > > new to Python, so I'd really appreciate any suggestions on how this > > can be made more Pythonic. > > > Thanks, > > Shawn > > > Okay, here's what I have come up with: > > > #! /usr/bin/python > > > import sys > > import re > > > month > > ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12} > > > infile=file('TVA-0316','r') > > outfile=file('tmp.out','w') > > > def formatDatePart(x): > >"take a number and transform it into a two-character string, > > zero padded" > >x = str(x) > >while len(x) < 2: > >x = "0" + x > >return x > > x = "%02d" % x > > > regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},") > > regexps are not really pythonic - we tend to use them only when we have > no better option. When it comes to parsing CSV files and/or dates, we do > have better solution : the csv module and the datetime module > > > for line in infile: > >matches = regex.findall(line) > >for someDate in matches: > > >dayNum = formatDatePart(someDate[1:3]) > >monthNum = formatDatePart(month[someDate[4:7]]) > >yearNum = formatDatePart(someDate[8:12]) > > >newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum) > >line = line.replace(someDate, newDate) > >outfile.writelines(line) > > > infile.close > > outfile.close > > I wonder why some of us took time to answer your first question. You > obviously forgot to read these answers. No offense - but the fact that 're' module is available, doesn't that mean we can use it? (Pythonic or not - not sure what is really pythonic at this stage of learning...) Like Perl, I'm sure there are more than one way to solve problems in Python. I appreciate everyone's feedback - I definitely got more than expected, but it feels comforting that people do care about writing better codes! 
:)
Re: Fwd: Python new user question - file writeline error
Shawn Milo wrote:
> To the list:
>
> I have come up with something that's working fine. However, I'm fairly
> new to Python, so I'd really appreciate any suggestions on how this
> can be made more Pythonic.
>
> Thanks,
> Shawn
>
> Okay, here's what I have come up with:
>
> #! /usr/bin/python
>
> import sys
> import re
>
> month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
>
> infile=file('TVA-0316','r')
> outfile=file('tmp.out','w')
>
> def formatDatePart(x):
>     "take a number and transform it into a two-character string, zero padded"
>     x = str(x)
>     while len(x) < 2:
>         x = "0" + x
>     return x

x = "%02d" % x

> regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")

Regexps are not really pythonic - we tend to use them only when we have no better option. When it comes to parsing CSV files and/or dates, we do have better solutions: the csv module and the datetime module.

> for line in infile:
>     matches = regex.findall(line)
>     for someDate in matches:
>
>         dayNum = formatDatePart(someDate[1:3])
>         monthNum = formatDatePart(month[someDate[4:7]])
>         yearNum = formatDatePart(someDate[8:12])
>
>         newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
>         line = line.replace(someDate, newDate)
>     outfile.writelines(line)
>
> infile.close
> outfile.close

I wonder why some of us took time to answer your first question. You obviously forgot to read these answers.
Re: Fwd: Python new user question - file writeline error
On 2/8/07, Jussi Salmela <[EMAIL PROTECTED]> wrote: > Shawn Milo kirjoitti: > > To the list: > > > > I have come up with something that's working fine. However, I'm fairly > > new to Python, so I'd really appreciate any suggestions on how this > > can be made more Pythonic. > > > > Thanks, > > Shawn > > > > > > > > > > > > > > Okay, here's what I have come up with: > > What follows may feel harsh but you asked for it ;) > > > > > > > #! /usr/bin/python > > > > import sys > > import re > > > > month > > ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12} > > > > infile=file('TVA-0316','r') > > outfile=file('tmp.out','w') > > > > def formatDatePart(x): > >"take a number and transform it into a two-character string, > > zero padded" > If a comment or doc string is misleading one would be better off without > it entirely: > "take a number": the function can in fact take (at least) > any base type > "transform it": the function doesn't transform x to anything > although the name of the variable x is the same > as the argument x > "two-character string": to a string of at least 2 chars > "zero padded": where left/right??? > >x = str(x) > >while len(x) < 2: > >x = "0" + x > You don't need loops for these kind of things. One possibility is to > replace the whole body with: > return str(x).zfill(2) > >return x > > > > regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},") > > > > for line in infile: > >matches = regex.findall(line) > >for someDate in matches: > > > Empty lines are supposed to make code more readable. 
The above empty > line does the contrary by separating the block controlled by the for > and the for statement > >dayNum = formatDatePart(someDate[1:3]) > >monthNum = formatDatePart(month[someDate[4:7]]) > >yearNum = formatDatePart(someDate[8:12]) > You don't need the formatDatePart function at all: > newDate = ",%4s-%02d-%2s," % \ > (someDate[8:12],month[someDate[4:7]],someDate[1:3]) > > > >newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum) > >line = line.replace(someDate, newDate) > > > >outfile.writelines(line) > > > > infile.close > > outfile.close > You have not read the answers given to the OP, have you. Because if you > had, your code would be: > infile.close() > outfile.close() > The reason your version seems to be working, is that you probably > execute your code from the command-line and exiting from Python to > command-line closes the files, even if you don't. > > Cheers, > Jussi > -- > http://mail.python.org/mailman/listinfo/python-list > Jussi, Thanks for the feedback. I received similar comments on a couple of those items, and posted a newer version an hour or two ago. I think the only thing missing there is a friendly blank line after my "for line in infile:" statement. Please let me know if there is anything else. Shawn -- http://mail.python.org/mailman/listinfo/python-list
Re: Fwd: Python new user question - file writeline error
Shawn Milo wrote:
> To the list:
>
> I have come up with something that's working fine. However, I'm fairly
> new to Python, so I'd really appreciate any suggestions on how this
> can be made more Pythonic.
>
> Thanks,
> Shawn
>
> Okay, here's what I have come up with:

What follows may feel harsh but you asked for it ;)

> #! /usr/bin/python
>
> import sys
> import re
>
> month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
>
> infile=file('TVA-0316','r')
> outfile=file('tmp.out','w')
>
> def formatDatePart(x):
>     "take a number and transform it into a two-character string, zero padded"

If a comment or doc string is misleading, one would be better off without it entirely:
    "take a number": the function can in fact take (at least) any base type
    "transform it": the function doesn't transform x to anything, although the name of the variable x is the same as the argument x
    "two-character string": to a string of at least 2 chars
    "zero padded": where, left or right???

>     x = str(x)
>     while len(x) < 2:
>         x = "0" + x

You don't need loops for this kind of thing. One possibility is to replace the whole body with:
    return str(x).zfill(2)

>     return x
>
> regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")
>
> for line in infile:
>     matches = regex.findall(line)
>     for someDate in matches:
>

Empty lines are supposed to make code more readable. The above empty line does the contrary by separating the for statement from the block it controls.

>         dayNum = formatDatePart(someDate[1:3])
>         monthNum = formatDatePart(month[someDate[4:7]])
>         yearNum = formatDatePart(someDate[8:12])

You don't need the formatDatePart function at all:
    newDate = ",%4s-%02d-%2s," % \
        (someDate[8:12],month[someDate[4:7]],someDate[1:3])

>
>         newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
>         line = line.replace(someDate, newDate)
>
>     outfile.writelines(line)
>
> infile.close
> outfile.close

You have not read the answers given to the OP, have you? Because if you had, your code would be:
    infile.close()
    outfile.close()
The reason your version seems to be working is that you probably execute your code from the command line, and exiting from Python to the command line closes the files even if you don't.

Cheers,
Jussi
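For reference, the two padding suggestions above can be tried in isolation. This is only a small sketch in current Python 3; pad2 is a hypothetical name for the zfill one-liner, and the shortened month table exists just for the example.

```python
def pad2(value):
    """Left-pad str(value) with zeros to at least two characters."""
    return str(value).zfill(2)


some_date = ",05/MAR/1950,"
month = {"MAR": 3}  # shortened table, for illustration only

# The single formatting expression proposed instead of formatDatePart:
# the year and day are already strings of the right length, so only the
# month number needs zero padding.
new_date = ",%4s-%02d-%2s," % (some_date[8:12], month[some_date[4:7]], some_date[1:3])

print(pad2(3), pad2(12), new_date)
```

Note that zfill pads on the left and never truncates, which matches the "at least 2 chars" remark about the docstring.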
Re: Fwd: Python new user question - file writeline error
On 8 Feb 2007 09:05:51 -0800, Gabriel Genellina <[EMAIL PROTECTED]> wrote:
> On 8 Feb, 12:41, "Shawn Milo" <[EMAIL PROTECTED]> wrote:
>
> > I have come up with something that's working fine. However, I'm fairly
> > new to Python, so I'd really appreciate any suggestions on how this
> > can be made more Pythonic.
>
> A few comments:
>
> You don't need the formatDatePart function; delete it, and replace
>     newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
> with
>     newDate = ",%04.4d-%02.2d-%02.2d," % (yearNum,monthNum,dayNum)
>
> and before:
>     dayNum, monthNum, yearNum = [int(num) for num in
>         someDate[1:-1].split('/')]
>
> And this: outfile.writelines(line)
> should be: outfile.write(line)
> (writelines works almost by accident here).
>
> You forget again to use () to call the close methods:
>     infile.close()
>     outfile.close()
>
> I don't like the final replace, but for a script like this I think
> it's OK.
>
> --
> Gabriel Genellina

Gabriel,

Thanks for the comments! The new version is below. I thought it made a little more sense to format the newDate = ... line the way I have it below, although I did incorporate your suggestions. Also, the formatting options you provided seemed to specify not only string padding, but also decimal places, so I changed it. Please let me know if there is some other meaning behind the way you did it.

As for not liking the replace line, what would you suggest instead?

Shawn

    #! /usr/bin/python

    import sys
    import re

    month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}

    infile=file('TVA-0316','r')
    outfile=file('tmp.out','w')

    regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")

    for line in infile:
        matches = regex.findall(line)
        for someDate in matches:
            dayNum = someDate[1:3]
            monthNum = month[someDate[4:7]]
            yearNum = someDate[8:12]
            newDate = ",%04d-%02d-%02d," % (int(yearNum),int(monthNum),int(dayNum))
            line = line.replace(someDate, newDate)
        outfile.write(line)

    infile.close()
    outfile.close()
Re: Fwd: Python new user question - file writeline error
On 8 Feb, 12:41, "Shawn Milo" <[EMAIL PROTECTED]> wrote:

> I have come up with something that's working fine. However, I'm fairly
> new to Python, so I'd really appreciate any suggestions on how this
> can be made more Pythonic.

A few comments:

You don't need the formatDatePart function; delete it, and replace
    newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
with
    newDate = ",%04.4d-%02.2d-%02.2d," % (yearNum,monthNum,dayNum)

and before:
    dayNum, monthNum, yearNum = [int(num) for num in
        someDate[1:-1].split('/')]

And this: outfile.writelines(line)
should be: outfile.write(line)
(writelines works almost by accident here).

You forget again to use () to call the close methods:
    infile.close()
    outfile.close()

I don't like the final replace, but for a script like this I think it's OK.

--
Gabriel Genellina
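A hedged sketch of two of the points above, in current Python 3. One caveat worth spelling out: the middle part of the date is an abbreviation such as MAR, so calling int() on all three parts would raise ValueError; the sketch sends the month through the lookup table instead. The names MONTHS and reformat_date are invented for the example, and io.StringIO stands in for the output file.

```python
import io

MONTHS = {'JAN': 1, 'FEB': 2, 'MAR': 3, 'APR': 4, 'MAY': 5, 'JUN': 6,
          'JUL': 7, 'AUG': 8, 'SEP': 9, 'OCT': 10, 'NOV': 11, 'DEC': 12}


def reformat_date(some_date):
    """Turn ',05/MAR/1950,' into ',1950-03-05,'."""
    day, month_name, year = some_date[1:-1].split('/')
    # Only day and year are digit strings; the month needs the table.
    return ",%04d-%02d-%02d," % (int(year), MONTHS[month_name], int(day))


# For integers, a precision in %d means "at least this many digits",
# so here it adds nothing to the zero-padded width.
same_output = ("%04.4d" % 1950) == ("%04d" % 1950)

buf = io.StringIO()
buf.write("a,b\n")               # one string, one call
buf.writelines("c,d\n")          # a string is iterable: written character by character
buf.writelines(["e\n", "f\n"])   # intended use: an iterable of strings

print(reformat_date(",05/MAR/1950,"), same_output, repr(buf.getvalue()))
```

This is why writelines(line) happens to produce the right output in the script: the characters are written one after another and end up as the same text.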
Fwd: Python new user question - file writeline error
To the list:

I have come up with something that's working fine. However, I'm fairly new to Python, so I'd really appreciate any suggestions on how this can be made more Pythonic.

Thanks,
Shawn

Okay, here's what I have come up with:

    #! /usr/bin/python

    import sys
    import re

    month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}

    infile=file('TVA-0316','r')
    outfile=file('tmp.out','w')

    def formatDatePart(x):
        "take a number and transform it into a two-character string, zero padded"
        x = str(x)
        while len(x) < 2:
            x = "0" + x
        return x

    regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")

    for line in infile:
        matches = regex.findall(line)
        for someDate in matches:

            dayNum = formatDatePart(someDate[1:3])
            monthNum = formatDatePart(month[someDate[4:7]])
            yearNum = formatDatePart(someDate[8:12])

            newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
            line = line.replace(someDate, newDate)

        outfile.writelines(line)

    infile.close
    outfile.close
Re: Python new user question - file writeline error
James wrote:
> On Feb 7, 4:59 pm, "Shawn Milo" <[EMAIL PROTECTED]> wrote:
> (snip)
>> I'm pretty new to Python myself, but if you'd like help with a
>> Perl/regex solution, I'm up for it. For that matter, whipping up a
>> Python/regex solution would probably be good for me. Let me know.
>>
>> Shawn
>
> Thank you very much for your kind offer.
> I'm also coming from Perl myself - heard many good things about Python
> so I'm trying it out - but it seems harder than I thought :(

If I may comment: Python is not Perl, and trying to solve things the Perl way, while still possible, may not be the best idea. (I don't mean Perl is a bad idea in itself - just that it's another language with another way to do things.) Here, doing the parsing oneself - either manually as James did or with regexps - is certainly not as easy as with Perl, and IMHO not the simplest way to go, when the csv module can take care of parsing and formatting CSV files and the datetime module of parsing and formatting dates.

Just my 2 cents...
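A rough sketch of what that csv plus datetime approach might look like in current Python 3. The column positions (6, 8, 10, 14) are taken from James's script; the function names are hypothetical. It also assumes an English or C locale, since %b matches the locale's abbreviated month names.

```python
import csv
import io
from datetime import datetime

# Positions of the date fields, as used in James's original script.
DATE_COLUMNS = (6, 8, 10, 14)


def convert_row(row):
    """Return a copy of row with DD/MON/YYYY fields rewritten as YYYY-MM-DD."""
    out = list(row)
    for i in DATE_COLUMNS:
        if i < len(out) and out[i]:
            parsed = datetime.strptime(out[i], "%d/%b/%Y")
            out[i] = parsed.strftime("%Y-%m-%d")
    return out


def convert(infile, outfile):
    """Read CSV rows from infile and write the converted rows to outfile."""
    writer = csv.writer(outfile, lineterminator="\n")
    for row in csv.reader(infile):
        writer.writerow(convert_row(row))


# In-memory usage; real files would be opened with open(..., newline="").
source = io.StringIO("06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,"
                     "20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,AST,19,U/L,5,40,,\n")
target = io.StringIO()
convert(source, target)
print(target.getvalue())
```

The trade-off against the regex version is that this one names the date columns explicitly, so a date-shaped value in some other column is left alone.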
Re: Python new user question - file writeline error
On 7 Feb 2007 11:31:32 -0800, James <[EMAIL PROTECTED]> wrote:
> I have this code:
...
> infile.close
> outfile.close
...
> 1. the outfile doesn't complete with no error message. when I check
> the last line in the python interpreter, it has read and processed the
> last line, but the output file stopped before.

You need to call the close methods on your file objects like this:

    outfile.close()

If you leave off the parentheses, you get the method object, but don't do anything with it.

> 2. Is this the best way to do this in Python?

I would parse your dates using the python time module, like this:

    Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
    IDLE 1.2
    >>> import time
    >>> line = r'06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,AST,19,U/L,5,40,,'
    >>> item = line.split(',')
    >>> dob = item[6]
    >>> dob_time = time.strptime(dob, '%d/%b/%Y')
    >>> dob_time
    (1950, 3, 5, 0, 0, 0, 6, 64, -1)
    >>> time.strftime('%a, %d %b %Y', dob_time)
    'Sun, 05 Mar 1950'
    >>> time.strftime('%Y-%m-%d', dob_time)
    '1950-03-05'

See the docs for the time module here: http://docs.python.org/lib/module-time.html

Using that will probably result in code that's quite a bit easier to read if you ever have to come back to it.

You also might want to investigate the csv module (http://docs.python.org/lib/module-csv.html) for a bunch of tools specifically tailored to working with files full of comma separated values like your input files.

--
Jerry
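On the first point above, the difference between naming the method and calling it can be checked directly. This is a small sketch in current Python 3 using a throwaway temporary file; the file name and variable names are arbitrary. It also shows the with statement, which closes the file on leaving the block.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "tmp.out")

outfile = open(path, "w")
outfile.write("1950-03-05\n")

outfile.close                      # no parentheses: the method is looked up, not called
still_open = not outfile.closed    # the file is still open at this point

outfile.close()                    # the call flushes buffered data and closes the file
now_closed = outfile.closed

# With a with-block there is no close() to forget.
with open(path) as infile:
    contents = infile.read()

print(still_open, now_closed, repr(contents), infile.closed)
```

Data written to a file that is never closed may still sit in a buffer, which would be consistent with the truncated output James describes.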
Re: Python new user question - file writeline error
On Feb 7, 4:59 pm, "Shawn Milo" <[EMAIL PROTECTED]> wrote: > On 7 Feb 2007 11:31:32 -0800, James <[EMAIL PROTECTED]> wrote: > > > > > Hello, > > > I'm a newbie to Python & wondering someone can help me with this... > > > I have this code: > > -- > > #! /usr/bin/python > > > import sys > > > month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG': > > 8,'SEP':9,'OCT':10,'NOV':11,'DEC':12} > > infile=file('TVA-0316','r') > > outfile=file('tmp.out','w') > > > for line in infile: > > item = line.split(',') > > dob = item[6].split('/') > > dob = dob[2]+'-'+str(month[dob[1]])+'-'+dob[0] > > lbdt = item[8].split('/') > > lbdt = lbdt[2]+'-'+str(month[lbdt[1]])+'-'+lbdt[0] > > lbrc = item[10].split('/') > > lbrc = lbrc[2]+'-'+str(month[lbrc[1]])+'-'+lbrc[0] > > lbrp = item[14].split('/') > > lbrp = lbrp[2]+'-'+str(month[lbrp[1]])+'-'+lbrp[0] > > item[6] = dob > > item[8] = lbdt > > item[10]=lbrc > > item[14]=lbrp > > list = ','.join(item) > > outfile.writelines(list) > > infile.close > > outfile.close > > - > > > And the data file(TVA-0316) looks like this: > > - > > 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/ > > NOV/2006,V1,,,21/NOV/2006,AST,19,U/L,5,40,, > > 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/ > > NOV/2006,V1,,,21/NOV/2006,GGT,34,U/L,11,32,h, > > 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/ > > NOV/2006,V1,,,21/NOV/2006,ALT,31,U/L,5,29,h, > > 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/ > > NOV/2006,V1,,,21/NOV/2006,ALKP,61,U/L,40,135,, > > - > > > Basically I'm reading in each line and converting all date fields (05/ > > MAR/1950) to different format (1950-03-05) in order to load into MySQL > > table. > > > I have two issues: > > 1. the outfile doesn't complete with no error message. when I check > > the last line in the python interpreter, it has read and processed the > > last line, but the output file stopped before. > > 2. 
Is this the best way to do this in Python? > > 3. (Out of scope) is there a way to load this CSV file directly into > > MySQL data field without converting the format? > > > Thank you. > > > James > > > -- > >http://mail.python.org/mailman/listinfo/python-list > > Your script worked for me. I'm not sure what the next step is in > troubleshooting it. Is it possible that your whitespace isn't quite > right? I had to reformat it, but I assume it was because of the way > cut & paste worked from Gmail. > > I usually use Perl for data stuff like this, but I don't see why > Python wouldn't be a great solution. However, I would re-write it > using regexes, to seek and replace sections that are formatted like a > date, rather than breaking it into a variable for each field, changing > each date individually, then putting them back together. > > As for how MySQL likes having dates formatted in CSV input: I can't > help there, but I'm sure someone else can. > > I'm pretty new to Python myself, but if you'd like help with a > Perl/regex solution, I'm up for it. For that matter, whipping up a > Python/regex solution would probably be good for me. Let me know. > > Shawn Thank you very much for your kind offer. I'm also coming from Perl myself - heard many good things about Python so I'm trying it out - but it seems harder than I thought :( James -- http://mail.python.org/mailman/listinfo/python-list
Re: Python new user question - file writeline error
On 7 Feb 2007 11:31:32 -0800, James <[EMAIL PROTECTED]> wrote: > Hello, > > I'm a newbie to Python & wondering someone can help me with this... > > I have this code: > -- > #! /usr/bin/python > > import sys > > month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG': > 8,'SEP':9,'OCT':10,'NOV':11,'DEC':12} > infile=file('TVA-0316','r') > outfile=file('tmp.out','w') > > for line in infile: > item = line.split(',') > dob = item[6].split('/') > dob = dob[2]+'-'+str(month[dob[1]])+'-'+dob[0] > lbdt = item[8].split('/') > lbdt = lbdt[2]+'-'+str(month[lbdt[1]])+'-'+lbdt[0] > lbrc = item[10].split('/') > lbrc = lbrc[2]+'-'+str(month[lbrc[1]])+'-'+lbrc[0] > lbrp = item[14].split('/') > lbrp = lbrp[2]+'-'+str(month[lbrp[1]])+'-'+lbrp[0] > item[6] = dob > item[8] = lbdt > item[10]=lbrc > item[14]=lbrp > list = ','.join(item) > outfile.writelines(list) > infile.close > outfile.close > - > > And the data file(TVA-0316) looks like this: > - > 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/ > NOV/2006,V1,,,21/NOV/2006,AST,19,U/L,5,40,, > 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/ > NOV/2006,V1,,,21/NOV/2006,GGT,34,U/L,11,32,h, > 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/ > NOV/2006,V1,,,21/NOV/2006,ALT,31,U/L,5,29,h, > 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/ > NOV/2006,V1,,,21/NOV/2006,ALKP,61,U/L,40,135,, > - > > Basically I'm reading in each line and converting all date fields (05/ > MAR/1950) to different format (1950-03-05) in order to load into MySQL > table. > > I have two issues: > 1. the outfile doesn't complete with no error message. when I check > the last line in the python interpreter, it has read and processed the > last line, but the output file stopped before. > 2. Is this the best way to do this in Python? > 3. (Out of scope) is there a way to load this CSV file directly into > MySQL data field without converting the format? > > Thank you. 
> > James > > -- > http://mail.python.org/mailman/listinfo/python-list > Your script worked for me. I'm not sure what the next step is in troubleshooting it. Is it possible that your whitespace isn't quite right? I had to reformat it, but I assume it was because of the way cut & paste worked from Gmail. I usually use Perl for data stuff like this, but I don't see why Python wouldn't be a great solution. However, I would re-write it using regexes, to seek and replace sections that are formatted like a date, rather than breaking it into a variable for each field, changing each date individually, then putting them back together. As for how MySQL likes having dates formatted in CSV input: I can't help there, but I'm sure someone else can. I'm pretty new to Python myself, but if you'd like help with a Perl/regex solution, I'm up for it. For that matter, whipping up a Python/regex solution would probably be good for me. Let me know. Shawn -- http://mail.python.org/mailman/listinfo/python-list
Re: Python new user question - file writeline error
James wrote:
> Hello,
>
> I'm a newbie to Python & wondering someone can help me with this...
>
> I have this code:
> --
> #! /usr/bin/python
>
> import sys
>
> month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
> infile=file('TVA-0316','r')
> outfile=file('tmp.out','w')
>
> for line in infile:
>     item = line.split(',')

CSV format? http://docs.python.org/lib/module-csv.html

>     dob = item[6].split('/')
>     dob = dob[2]+'-'+str(month[dob[1]])+'-'+dob[0]

Why did you use integers as values in the month dict if you only use them as strings?

>     lbdt = item[8].split('/')
>     lbdt = lbdt[2]+'-'+str(month[lbdt[1]])+'-'+lbdt[0]
>     lbrc = item[10].split('/')
>     lbrc = lbrc[2]+'-'+str(month[lbrc[1]])+'-'+lbrc[0]
>     lbrp = item[14].split('/')
>     lbrp = lbrp[2]+'-'+str(month[lbrp[1]])+'-'+lbrp[0]

This may help too: http://docs.python.org/lib/module-datetime.html

>     item[6] = dob
>     item[8] = lbdt
>     item[10]=lbrc
>     item[14]=lbrp
>     list = ','.join(item)

Better to avoid using builtin type names as identifiers. And FWIW, this is *not* a list...

>     outfile.writelines(list)

You want file.write() here (there is no file.writeline()); writelines() expects a sequence of strings. Note that write() does not add a newline for you; in this script the last field still carries the newline read from the input.

> infile.close

You're not actually *calling* infile.close - just getting a reference to the file.close method. The parens are not optional in Python, they are the call operator.

> outfile.close

Same here.

> -
>
> And the data file(TVA-0316) looks like this:
> -
> 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,AST,19,U/L,5,40,,
> 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,GGT,34,U/L,11,32,h,
> 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,ALT,31,U/L,5,29,h,
> 06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,ALKP,61,U/L,40,135,,
> -
>
> Basically I'm reading in each line and converting all date fields
> (05/MAR/1950) to different format (1950-03-05) in order to load into MySQL
> table.
>
> I have two issues:
> 1. the outfile doesn't complete with no error message. when I check
> the last line in the python interpreter, it has read and processed the
> last line, but the output file stopped before.

Use the csv module and cleanly close your files, then come back if you still have problems.

> 2. Is this the best way to do this in Python?

Err... What to say... Obviously, no.
Python new user question - file writeline error
Hello,

I'm a newbie to Python & wondering if someone can help me with this...

I have this code:
--
    #! /usr/bin/python

    import sys

    month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
    infile=file('TVA-0316','r')
    outfile=file('tmp.out','w')

    for line in infile:
        item = line.split(',')
        dob = item[6].split('/')
        dob = dob[2]+'-'+str(month[dob[1]])+'-'+dob[0]
        lbdt = item[8].split('/')
        lbdt = lbdt[2]+'-'+str(month[lbdt[1]])+'-'+lbdt[0]
        lbrc = item[10].split('/')
        lbrc = lbrc[2]+'-'+str(month[lbrc[1]])+'-'+lbrc[0]
        lbrp = item[14].split('/')
        lbrp = lbrp[2]+'-'+str(month[lbrp[1]])+'-'+lbrp[0]
        item[6] = dob
        item[8] = lbdt
        item[10]=lbrc
        item[14]=lbrp
        list = ','.join(item)
        outfile.writelines(list)
    infile.close
    outfile.close
-

And the data file (TVA-0316) looks like this:
-
    06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,AST,19,U/L,5,40,,
    06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,GGT,34,U/L,11,32,h,
    06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,ALT,31,U/L,5,29,h,
    06-0588,03,701,03701,046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,ALKP,61,U/L,40,135,,
-

Basically I'm reading in each line and converting all date fields (05/MAR/1950) to a different format (1950-03-05) in order to load them into a MySQL table.

I have two issues:
1. The outfile is incomplete, with no error message. When I check the last line in the Python interpreter, it has read and processed the last line, but the output file stops before that.
2. Is this the best way to do this in Python?
3. (Out of scope) Is there a way to load this CSV file directly into a MySQL date field without converting the format?

Thank you.

James