Re: Strange thing with types

2008-05-29 Thread Diez B. Roggisch
TYR wrote:

> I'm doing some data normalisation, which involves data from a Web site
> being extracted with BeautifulSoup, cleaned up with a regex, then
> having the current year as returned by time()'s tm_year attribute
> inserted, before the data is concatenated with string.join() and fed
> to time.strptime().
>
> Here's some code:
> timeinput = re.split('[\s:-]', rawtime)
> print timeinput #trace statement
> print year #trace statement
> t = timeinput.insert(2, year)
> print t #trace statement
> t1 = string.join(t, '')
> timeobject = time.strptime(t1, "%d %b %Y %H %M")
>
> year is a Unicode string; so is the data in rawtime (BeautifulSoup
> gives you Unicode, dammit). And here's the output:
>
> [u'29', u'May', u'01', u'00'] (OK, so the regex is working)
> 2008 (OK, so the year is a year)
> None (...but what's this?)
> Traceback (most recent call last):
>   File "bothv2.py", line 71, in <module>
>     t1 = string.join(t, '')
>   File "/usr/lib/python2.5/string.py", line 316, in join
>     return sep.join(words)
> TypeError

First - don't use module string anymore. Use e.g.

''.join(t)

Second, you can only join strings, but year is an integer. So convert it to
a string first:

t = timeinput.insert(2, str(year))
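
A minimal sketch of the join point above, assuming year arrives as an int. The sample values (parts, year) and the single-space separator are assumptions for illustration, chosen to match the output quoted in the question; the list is built by slicing here so the example does not depend on what insert() returns.

```python
# Hypothetical sample values standing in for the regex output and the year.
parts = ['29', 'May', '01', '00']
year = 2008  # an int here, purely for illustration

# str.join() refuses a sequence that contains a non-string item.
join_failed = False
try:
    ' '.join(parts[:2] + [year] + parts[2:])
except TypeError:
    join_failed = True

# Converting the year with str() first makes the join succeed.
joined = ' '.join(parts[:2] + [str(year)] + parts[2:])
print(joined)  # 29 May 2008 01 00
```

The string method ' '.join(...) is used in place of string.join(), as suggested above.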

Diez
--
http://mail.python.org/mailman/listinfo/python-list


Re: Strange thing with types

2008-05-29 Thread alex23
On May 29, 11:09 pm, TYR [EMAIL PROTECTED] wrote:
> I'm doing some data normalisation, which involves data from a Web site
> being extracted with BeautifulSoup, cleaned up with a regex, then
> having the current year as returned by time()'s tm_year attribute
> inserted, before the data is concatenated with string.join() and fed
> to time.strptime().
>
> Here's some code:
> timeinput = re.split('[\s:-]', rawtime)
> print timeinput #trace statement
> print year #trace statement
> t = timeinput.insert(2, year)
> print t #trace statement
> t1 = string.join(t, '')
> timeobject = time.strptime(t1, "%d %b %Y %H %M")
>
> year is a Unicode string; so is the data in rawtime (BeautifulSoup
> gives you Unicode, dammit). And here's the output:
>
> [u'29', u'May', u'01', u'00'] (OK, so the regex is working)
> 2008 (OK, so the year is a year)
> None (...but what's this?)
> Traceback (most recent call last):
>   File "bothv2.py", line 71, in <module>
>     t1 = string.join(t, '')
>   File "/usr/lib/python2.5/string.py", line 316, in join
>     return sep.join(words)
> TypeError

list.insert modifies the list in-place:

>>> l = [1,2,3]
>>> l.insert(2,4)
>>> l
[1, 2, 4, 3]

It also returns None, which is what you're assigning to 't' and then
trying to join.

Replace your usage of 't' with 'timeinput' and it should work.
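
A rough sketch of how the corrected sequence might look, under some assumptions: parse_site_time is a hypothetical helper name, the sample input '29 May 01:00' is guessed from the output quoted above, and a single space is used as the join separator so that the result matches the strptime format. Note that %b depends on the locale, so 'May' is assumed to be parsed under an English or C locale.

```python
import re
import time

# list.insert() changes the list itself and returns None.
items = [1, 2, 3]
returned = items.insert(2, 4)  # returned is None, items is [1, 2, 4, 3]


def parse_site_time(rawtime, year):
    """Hypothetical helper: split rawtime, insert the year, parse the result."""
    timeinput = re.split(r'[\s:-]', rawtime)
    # Call insert() for its side effect only; do not assign its return value.
    timeinput.insert(2, str(year))
    joined = ' '.join(timeinput)
    return time.strptime(joined, '%d %b %Y %H %M')


parsed = parse_site_time('29 May 01:00', 2008)
print(time.strftime('%Y-%m-%d %H:%M', parsed))  # 2008-05-29 01:00
```

Wrapping the year in str() keeps the join safe whether year is an int or already a string.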


Strange thing with types

2008-05-29 Thread TYR
I'm doing some data normalisation, which involves data from a Web site
being extracted with BeautifulSoup, cleaned up with a regex, then
having the current year as returned by time()'s tm_year attribute
inserted, before the data is concatenated with string.join() and fed
to time.strptime().

Here's some code:
timeinput = re.split('[\s:-]', rawtime)
print timeinput #trace statement
print year #trace statement
t = timeinput.insert(2, year)
print t #trace statement
t1 = string.join(t, '')
timeobject = time.strptime(t1, "%d %b %Y %H %M")

year is a Unicode string; so is the data in rawtime (BeautifulSoup
gives you Unicode, dammit). And here's the output:

[u'29', u'May', u'01', u'00'] (OK, so the regex is working)
2008 (OK, so the year is a year)
None (...but what's this?)
Traceback (most recent call last):
  File "bothv2.py", line 71, in <module>
    t1 = string.join(t, '')
  File "/usr/lib/python2.5/string.py", line 316, in join
    return sep.join(words)
TypeError


Re: Strange thing with types

2008-05-29 Thread TYR
On May 29, 2:23 pm, Diez B. Roggisch [EMAIL PROTECTED] wrote:
> TYR wrote:
> > I'm doing some data normalisation, which involves data from a Web site
> > being extracted with BeautifulSoup, cleaned up with a regex, then
> > having the current year as returned by time()'s tm_year attribute
> > inserted, before the data is concatenated with string.join() and fed
> > to time.strptime().
> >
> > Here's some code:
> > timeinput = re.split('[\s:-]', rawtime)
> > print timeinput #trace statement
> > print year #trace statement
> > t = timeinput.insert(2, year)
> > print t #trace statement
> > t1 = string.join(t, '')
> > timeobject = time.strptime(t1, "%d %b %Y %H %M")
> >
> > year is a Unicode string; so is the data in rawtime (BeautifulSoup
> > gives you Unicode, dammit). And here's the output:
> >
> > [u'29', u'May', u'01', u'00'] (OK, so the regex is working)
> > 2008 (OK, so the year is a year)
> > None (...but what's this?)
> > Traceback (most recent call last):
> >   File "bothv2.py", line 71, in <module>
> >     t1 = string.join(t, '')
> >   File "/usr/lib/python2.5/string.py", line 316, in join
> >     return sep.join(words)
> > TypeError
>
> First - don't use module string anymore. Use e.g.
>
> ''.join(t)
>
> Second, you can only join strings, but year is an integer. So convert it to
> a string first:
>
> t = timeinput.insert(2, str(year))
>
> Diez

Yes, tm_year is converted to a unicode string elsewhere in the program.


Re: Strange thing with types

2008-05-29 Thread TYR
On May 29, 2:24 pm, alex23 [EMAIL PROTECTED] wrote:
> On May 29, 11:09 pm, TYR [EMAIL PROTECTED] wrote:
>
> > I'm doing some data normalisation, which involves data from a Web site
> > being extracted with BeautifulSoup, cleaned up with a regex, then
> > having the current year as returned by time()'s tm_year attribute
> > inserted, before the data is concatenated with string.join() and fed
> > to time.strptime().
> >
> > Here's some code:
> > timeinput = re.split('[\s:-]', rawtime)
> > print timeinput #trace statement
> > print year #trace statement
> > t = timeinput.insert(2, year)
> > print t #trace statement
> > t1 = string.join(t, '')
> > timeobject = time.strptime(t1, "%d %b %Y %H %M")
> >
> > year is a Unicode string; so is the data in rawtime (BeautifulSoup
> > gives you Unicode, dammit). And here's the output:
> >
> > [u'29', u'May', u'01', u'00'] (OK, so the regex is working)
> > 2008 (OK, so the year is a year)
> > None (...but what's this?)
> > Traceback (most recent call last):
> >   File "bothv2.py", line 71, in <module>
> >     t1 = string.join(t, '')
> >   File "/usr/lib/python2.5/string.py", line 316, in join
> >     return sep.join(words)
> > TypeError
>
> list.insert modifies the list in-place:
>
> >>> l = [1,2,3]
> >>> l.insert(2,4)
> >>> l
> [1, 2, 4, 3]
>
> It also returns None, which is what you're assigning to 't' and then
> trying to join.
>
> Replace your usage of 't' with 'timeinput' and it should work.

Thank you.