Re: Fwd: Python new user question - file writeline error

2007-02-08 Thread Gabriel Genellina
On 8 Feb, 12:41, Shawn Milo [EMAIL PROTECTED] wrote:

> I have come up with something that's working fine. However, I'm fairly
> new to Python, so I'd really appreciate any suggestions on how this
> can be made more Pythonic.

A few comments:

You don't need the formatDatePart function; delete it, and replace
    newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
with
    newDate = ",%04.4d-%02.2d-%02.2d," % (yearNum,monthNum,dayNum)

and, before that line:
    dayNum, monthNum, yearNum = [int(num) for num in
                                 someDate[1:-1].split('/')]
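
A minimal sketch of the split-and-format step, assuming each date field looks like `,08/FEB/2007,` as matched by the regex in the script under discussion. Calling int() on a month abbreviation such as 'FEB' would raise ValueError, so this sketch sends the month through the lookup table from the script instead. The names `MONTH` and `reformat_date` are hypothetical, chosen only for illustration.

```python
# Month-abbreviation lookup, same mapping as in the script under discussion.
MONTH = {'JAN': 1, 'FEB': 2, 'MAR': 3, 'APR': 4, 'MAY': 5, 'JUN': 6,
         'JUL': 7, 'AUG': 8, 'SEP': 9, 'OCT': 10, 'NOV': 11, 'DEC': 12}


def reformat_date(some_date):
    """Turn ',DD/MON/YYYY,' into ',YYYY-MM-DD,' (hypothetical helper)."""
    # Strip the surrounding commas, then split on the slashes.
    day, mon, year = some_date[1:-1].split('/')
    # int() handles the numeric parts; the abbreviation needs the table.
    return ",%04d-%02d-%02d," % (int(year), MONTH[mon], int(day))


print(reformat_date(",08/FEB/2007,"))
```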

And this: outfile.writelines(line)
should be: outfile.write(line)
(writelines works almost by accident here).
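
Why writelines "works" can be seen in a small sketch. writelines() expects an iterable of strings, and a single string is itself an iterable of one-character strings, so the line is written character by character and the result happens to be the same. The example uses an in-memory io.StringIO buffer in place of the real output file; that substitution is an assumption made to keep it self-contained.

```python
import io

buf = io.StringIO()
line = "a,b,c\n"

# writelines() iterates over its argument; iterating over a str yields
# single characters, so each character is written on its own.
buf.writelines(line)

# write() takes the string as a whole and is the intended call here.
buf.write(line)

result = buf.getvalue()
print(repr(result))
```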

You forgot, again, to use () when calling the close methods:
infile.close()
outfile.close()
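
A short sketch of the difference, written for current Python (open() and the with statement rather than the file() builtin used in the script). The scratch path is an assumption for the demo only. Without the parentheses the method is merely looked up and never called, so the file stays open.

```python
import os
import tempfile

# Scratch file for the demo only.
path = os.path.join(tempfile.mkdtemp(), "tmp.out")

outfile = open(path, "w")
outfile.close                      # attribute lookup only; nothing is called
still_open = not outfile.closed
outfile.close()                    # the parentheses perform the call
now_closed = outfile.closed

# A with block closes the file on exit, even if the body raises.
with open(path, "w") as outfile:
    outfile.write("done\n")
closed_by_with = outfile.closed

print(still_open, now_closed, closed_by_with)
```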

I don't like the final replace, but for a script like this I think
it's OK.

--
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Fwd: Python new user question - file writeline error

2007-02-08 Thread Shawn Milo
On 8 Feb 2007 09:05:51 -0800, Gabriel Genellina [EMAIL PROTECTED] wrote:

Gabriel,

Thanks for the comments! The new version is below. I thought it made a
little more sense to format the newDate = ... line the way I have it
below, although I did incorporate your suggestions. Also, the
formatting options you provided seemed to specify not only string
padding, but also decimal places, so I changed it. Please let me know
if there is some other meaning behind the way you did it.

As for not liking the replace line, what would you suggest instead?

Shawn

#! /usr/bin/python

import sys
import re

month = {'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,
         'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
infile=file('TVA-0316','r')
outfile=file('tmp.out','w')

regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")

for line in infile:
    matches = regex.findall(line)
    for someDate in matches:

        dayNum = someDate[1:3]
        monthNum = month[someDate[4:7]]
        yearNum = someDate[8:12]

        newDate = ",%04d-%02d-%02d," % \
            (int(yearNum),int(monthNum),int(dayNum))
        line = line.replace(someDate, newDate)

    outfile.write(line)

infile.close()
outfile.close()


Re: Fwd: Python new user question - file writeline error

2007-02-08 Thread Jussi Salmela
Shawn Milo wrote:
> To the list:
>
> I have come up with something that's working fine. However, I'm fairly
> new to Python, so I'd really appreciate any suggestions on how this
> can be made more Pythonic.
>
> Thanks,
> Shawn
>
> Okay, here's what I have come up with:

What follows may feel harsh but you asked for it ;)

 
 
> #! /usr/bin/python
>
> import sys
> import re
>
> month = {'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,
>          'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
>
> infile=file('TVA-0316','r')
> outfile=file('tmp.out','w')
>
> def formatDatePart(x):
>     """take a number and transform it into a two-character string,
>     zero padded"""

If a comment or doc string is misleading, one would be better off without
it entirely:
    "take a number": the function can in fact take (at least)
        any base type
    "transform it": the function doesn't transform x to anything
        although the name of the variable x is the same
        as the argument x
    "two-character string": to a string of at least 2 chars
    "zero padded": where, left/right???

>     x = str(x)
>     while len(x) < 2:
>         x = "0" + x

You don't need loops for this kind of thing. One possibility is to
replace the whole body with:
    return str(x).zfill(2)

>     return x
>
> regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")
>
> for line in infile:
>     matches = regex.findall(line)
>     for someDate in matches:
>

Empty lines are supposed to make code more readable. The above empty
line does the contrary by separating the for statement from the block
it controls.

>         dayNum = formatDatePart(someDate[1:3])
>         monthNum = formatDatePart(month[someDate[4:7]])
>         yearNum = formatDatePart(someDate[8:12])

You don't need the formatDatePart function at all:
    newDate = ",%4s-%02d-%2s," % \
        (someDate[8:12],month[someDate[4:7]],someDate[1:3])

>
>         newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
>         line = line.replace(someDate, newDate)
>
>     outfile.writelines(line)
>
> infile.close
> outfile.close

You have not read the answers given to the OP, have you? Because if you
had, your code would be:
    infile.close()
    outfile.close()
The reason your version seems to be working is that you probably
execute your code from the command line, and exiting from Python to the
command line closes the files even if you don't.

Cheers,
Jussi
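
The zfill suggestion above can be sketched as follows. The function name `format_date_part` is a hypothetical stand-in for formatDatePart; the point is only that str.zfill pads on the left with zeros up to a minimum width and leaves longer strings alone, which matches what the while loop does.

```python
def format_date_part(x):
    """Return str(x), left-padded with zeros to at least two characters."""
    return str(x).zfill(2)


# Works for ints and strings alike; longer values pass through unchanged.
parts = [format_date_part(v) for v in (2, "8", 12, 2007)]
print(parts)
```

For integer input, `"%02d" % x` gives the same result without a helper function.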


Re: Fwd: Python new user question - file writeline error

2007-02-08 Thread Shawn Milo
On 2/8/07, Jussi Salmela [EMAIL PROTECTED] wrote:

Jussi,

Thanks for the feedback. I received similar comments on a couple of
those items, and posted a newer version an hour or two ago. I think
the only thing missing there is a friendly blank line after my
"for line in infile:" statement.

Please let me know if there is anything else.

Shawn


Re: Fwd: Python new user question - file writeline error

2007-02-08 Thread Bruno Desthuilliers
Shawn Milo wrote:
> To the list:
>
> I have come up with something that's working fine. However, I'm fairly
> new to Python, so I'd really appreciate any suggestions on how this
> can be made more Pythonic.
>
> Thanks,
> Shawn
>
> Okay, here's what I have come up with:
 
 
> #! /usr/bin/python
>
> import sys
> import re
>
> month = {'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,
>          'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
>
> infile=file('TVA-0316','r')
> outfile=file('tmp.out','w')
>
> def formatDatePart(x):
>     """take a number and transform it into a two-character string,
>     zero padded"""
>     x = str(x)
>     while len(x) < 2:
>         x = "0" + x
>     return x

x = "%02d" % x

> regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")

Regexps are not really Pythonic - we tend to use them only when we have
no better option. When it comes to parsing CSV files and/or dates, we do
have better solutions: the csv module and the datetime module.

> for line in infile:
>     matches = regex.findall(line)
>     for someDate in matches:
>
>         dayNum = formatDatePart(someDate[1:3])
>         monthNum = formatDatePart(month[someDate[4:7]])
>         yearNum = formatDatePart(someDate[8:12])
>
>         newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
>         line = line.replace(someDate, newDate)
>
>     outfile.writelines(line)
>
> infile.close
> outfile.close

I wonder why some of us took time to answer your first question. You 
obviously forgot to read these answers.
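
A rough sketch of what the csv-plus-datetime approach might look like, under the assumption that the input is plain comma-separated text with dates as whole fields in DD/MON/YYYY form. The names `MONTH`, `to_iso` and `convert` are hypothetical, and in-memory buffers stand in for the real files. Note that csv.writer may quote fields differently from the input, so the output is not guaranteed to match the original byte for byte outside the date fields.

```python
import csv
import datetime
import io

MONTH = {'JAN': 1, 'FEB': 2, 'MAR': 3, 'APR': 4, 'MAY': 5, 'JUN': 6,
         'JUL': 7, 'AUG': 8, 'SEP': 9, 'OCT': 10, 'NOV': 11, 'DEC': 12}


def to_iso(field):
    """Return YYYY-MM-DD for a DD/MON/YYYY field, else the field unchanged."""
    parts = field.split('/')
    if len(parts) != 3 or parts[1] not in MONTH:
        return field
    try:
        day, year = int(parts[0]), int(parts[2])
        # datetime.date validates the calendar date (no 31 February).
        return datetime.date(year, MONTH[parts[1]], day).isoformat()
    except ValueError:
        # Non-numeric parts or an impossible date: leave the field alone.
        return field


def convert(infile, outfile):
    """Copy CSV rows from infile to outfile, rewriting date fields."""
    writer = csv.writer(outfile, lineterminator="\n")
    for row in csv.reader(infile):
        writer.writerow([to_iso(field) for field in row])


src = io.StringIO("1,08/FEB/2007,widget\n2,31/DEC/1999,gadget\n")
dst = io.StringIO()
convert(src, dst)
print(dst.getvalue())
```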


Re: Fwd: Python new user question - file writeline error

2007-02-08 Thread James
On Feb 8, 3:26 pm, Bruno Desthuilliers
[EMAIL PROTECTED] wrote:

No offense, but doesn't the fact that the 're' module is available
mean we can use it? (Pythonic or not - I'm not sure what is really
Pythonic at this stage of learning...)
As in Perl, I'm sure there is more than one way to solve problems in
Python.

I appreciate everyone's feedback - I definitely got more than
expected, but it feels comforting that people do care about writing
better code! :)



Re: Fwd: Python new user question - file writeline error

2007-02-08 Thread Gabriel Genellina
On Thu, 08 Feb 2007 14:20:57 -0300, Shawn Milo [EMAIL PROTECTED] wrote:

> Gabriel,
>
> Thanks for the comments! The new version is below. I thought it made a
> little more sense to format the newDate = ... line the way I have it
> below, although I did incorporate your suggestions.

Looks pretty good to me!
Just one little thing I would change: the variables monthNum, dayNum etc.;
the suffix might indicate that they're numbers, but they're strings
instead. So I would move the int(...) calls a few lines up, to where the
variables are defined.
But that's just a cosmetic thing and a matter of taste.

> Also, the
> formatting options you provided seemed to specify not only string
> padding, but also decimal places, so I changed it. Please let me know
> if there is some other meaning behind the way you did it.

No, it has no meaning, at least for this range of values.
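
A quick check of that claim, as a sketch: with the `d` conversion the precision sets a minimum number of digits rather than decimal places, so for the non-negative day, month and year values in question the two spellings should produce identical output. The year range below is an arbitrary assumption for the test.

```python
# Compare the two format spellings over the values a date can take.
days_same = all("%02.2d" % n == "%02d" % n for n in range(1, 32))
years_same = all("%04.4d" % y == "%04d" % y for y in range(1900, 2101))

same = days_same and years_same
print(same)
```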

> As for not liking the replace line, what would you suggest instead?

You have already scanned the line to find the matching fragment; the match
object knows exactly where it begins and ends, so one could replace it
with the reformatted value without searching again (the second search
takes some more time, at least in principle).
But this makes the code a bit more complex, and it would only make sense
if you were to process millions of lines, and even then, the execution
might be I/O-bound so you would gain nothing at the end.
That's why I think it's OK as it is now.
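
One way to replace each match in place without a second search is to pass a function to the pattern's sub() method; this is a sketch of that idea, not necessarily what was meant above. The capture groups and the names `MONTH`, `DATE` and `iso` are assumptions added for illustration. As in the original script, an unknown three-letter abbreviation would raise KeyError, and two dates sharing a single comma would not both match.

```python
import re

MONTH = {'JAN': 1, 'FEB': 2, 'MAR': 3, 'APR': 4, 'MAY': 5, 'JUN': 6,
         'JUL': 7, 'AUG': 8, 'SEP': 9, 'OCT': 10, 'NOV': 11, 'DEC': 12}

# Same pattern as the script, with groups around day, month and year.
DATE = re.compile(r",(\d{2})/([A-Z]{3})/(\d{4}),")


def iso(match):
    """Build the replacement text from the match object itself."""
    day, mon, year = match.groups()
    return ",%04d-%02d-%02d," % (int(year), MONTH[mon], int(day))


line = "1,08/FEB/2007,x,31/DEC/1999,y\n"
converted = DATE.sub(iso, line)
print(converted)
```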

-- 
Gabriel Genellina
