Re: Strange thing with types
TYR wrote:

> I'm doing some data normalisation, which involves data from a Web site
> being extracted with BeautifulSoup, cleaned up with a regex, then
> having the current year as returned by time()'s tm_year attribute
> inserted, before the data is concatenated with string.join() and fed
> to time.strptime().
>
> Here's some code:
>
>     timeinput = re.split('[\s:-]', rawtime)
>     print timeinput #trace statement
>     print year #trace statement
>     t = timeinput.insert(2, year)
>     print t #trace statement
>     t1 = string.join(t, '')
>     timeobject = time.strptime(t1, '%d %b %Y %H %M')
>
> year is a Unicode string; so is the data in rawtime (BeautifulSoup
> gives you Unicode, dammit). And here's the output:
>
>     [u'29', u'May', u'01', u'00']   (OK, so the regex is working)
>     2008   (OK, so the year is a year)
>     None   (...but what's this?)
>     Traceback (most recent call last):
>       File "bothv2.py", line 71, in <module>
>         t1 = string.join(t, '')
>       File "/usr/lib/python2.5/string.py", line 316, in join
>         return sep.join(words)
>     TypeError

First - don't use module string anymore. Use e.g.

    ''.join(t)

Second, you can only join strings, but year is an integer. So convert it
to a string first:

    t = timeinput.insert(2, str(year))

Diez

--
http://mail.python.org/mailman/listinfo/python-list
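Diez's second point can be illustrated with a minimal sketch. It is written for Python 3 rather than the Python 2.5 used in the thread, and it assumes `year` is an int, which is Diez's supposition (TYR reports later in the thread that it was already a string). The names `parts`, `join_error` and `joined` are hypothetical, and the space separator is an assumption made so the result reads like a date.

```python
# Sketch only: shows that str.join() rejects non-string items.
parts = ['29', 'May', '01', '00']   # shaped like the poster's timeinput
year = 2008                         # assumed to be an int for this demonstration

try:
    ' '.join(parts[:2] + [year] + parts[2:])
    join_error = None
except TypeError as exc:
    join_error = exc                # raised because one item is an int

# Converting the int first makes every item a string, so join() succeeds.
joined = ' '.join(parts[:2] + [str(year)] + parts[2:])
print(joined)
```

The string method `' '.join(...)` is used here instead of `string.join(...)`, in line with Diez's first point.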
Re: Strange thing with types
On May 29, 11:09 pm, TYR [EMAIL PROTECTED] wrote:

> I'm doing some data normalisation, which involves data from a Web site
> being extracted with BeautifulSoup, cleaned up with a regex, then
> having the current year as returned by time()'s tm_year attribute
> inserted, before the data is concatenated with string.join() and fed
> to time.strptime().
>
> Here's some code:
>
>     timeinput = re.split('[\s:-]', rawtime)
>     print timeinput #trace statement
>     print year #trace statement
>     t = timeinput.insert(2, year)
>     print t #trace statement
>     t1 = string.join(t, '')
>     timeobject = time.strptime(t1, '%d %b %Y %H %M')
>
> year is a Unicode string; so is the data in rawtime (BeautifulSoup
> gives you Unicode, dammit). And here's the output:
>
>     [u'29', u'May', u'01', u'00']   (OK, so the regex is working)
>     2008   (OK, so the year is a year)
>     None   (...but what's this?)
>     Traceback (most recent call last):
>       File "bothv2.py", line 71, in <module>
>         t1 = string.join(t, '')
>       File "/usr/lib/python2.5/string.py", line 316, in join
>         return sep.join(words)
>     TypeError

list.insert modifies the list in-place:

    >>> l = [1,2,3]
    >>> l.insert(2,4)
    >>> l
    [1, 2, 4, 3]

It also returns None, which is what you're assigning to 't' and then
trying to join. Replace your usage of 't' with 'timeinput' and it should
work.
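The fix described above could look roughly like the following sketch, written for Python 3 rather than the Python 2.5 of the thread. The value of `rawtime` is a hypothetical sample chosen to reproduce the trace output, `year` is hard-coded as a string, and `result` is a hypothetical name used only to show the return value. The snippet joins with a space rather than the empty string, on the assumption that the joined text has to match the spaces in the format string.

```python
import re
import time

rawtime = '29 May 01:00'   # hypothetical input matching the poster's trace output
year = '2008'              # already a string, as the poster says

timeinput = re.split(r'[\s:-]', rawtime)   # ['29', 'May', '01', '00']
result = timeinput.insert(2, year)         # mutates timeinput in place, returns None

# Join the list that was mutated, not the return value of insert().
t1 = ' '.join(timeinput)
timeobject = time.strptime(t1, '%d %b %Y %H %M')
print(result, timeinput, timeobject.tm_year)
```

Parsing the month name with `%b` depends on the current locale; the sketch assumes an English or C locale.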
Strange thing with types
I'm doing some data normalisation, which involves data from a Web site
being extracted with BeautifulSoup, cleaned up with a regex, then having
the current year as returned by time()'s tm_year attribute inserted,
before the data is concatenated with string.join() and fed to
time.strptime().

Here's some code:

    timeinput = re.split('[\s:-]', rawtime)
    print timeinput #trace statement
    print year #trace statement
    t = timeinput.insert(2, year)
    print t #trace statement
    t1 = string.join(t, '')
    timeobject = time.strptime(t1, '%d %b %Y %H %M')

year is a Unicode string; so is the data in rawtime (BeautifulSoup gives
you Unicode, dammit). And here's the output:

    [u'29', u'May', u'01', u'00']   (OK, so the regex is working)
    2008   (OK, so the year is a year)
    None   (...but what's this?)
    Traceback (most recent call last):
      File "bothv2.py", line 71, in <module>
        t1 = string.join(t, '')
      File "/usr/lib/python2.5/string.py", line 316, in join
        return sep.join(words)
    TypeError
Re: Strange thing with types
On May 29, 2:23 pm, Diez B. Roggisch [EMAIL PROTECTED] wrote:

> TYR wrote:
>
> > I'm doing some data normalisation, which involves data from a Web
> > site being extracted with BeautifulSoup, cleaned up with a regex,
> > then having the current year as returned by time()'s tm_year
> > attribute inserted, before the data is concatenated with
> > string.join() and fed to time.strptime().
> >
> > Here's some code:
> >
> >     timeinput = re.split('[\s:-]', rawtime)
> >     print timeinput #trace statement
> >     print year #trace statement
> >     t = timeinput.insert(2, year)
> >     print t #trace statement
> >     t1 = string.join(t, '')
> >     timeobject = time.strptime(t1, '%d %b %Y %H %M')
> >
> > year is a Unicode string; so is the data in rawtime (BeautifulSoup
> > gives you Unicode, dammit). And here's the output:
> >
> >     [u'29', u'May', u'01', u'00']   (OK, so the regex is working)
> >     2008   (OK, so the year is a year)
> >     None   (...but what's this?)
> >     Traceback (most recent call last):
> >       File "bothv2.py", line 71, in <module>
> >         t1 = string.join(t, '')
> >       File "/usr/lib/python2.5/string.py", line 316, in join
> >         return sep.join(words)
> >     TypeError
>
> First - don't use module string anymore. Use e.g.
>
>     ''.join(t)
>
> Second, you can only join strings, but year is an integer. So convert
> it to a string first:
>
>     t = timeinput.insert(2, str(year))
>
> Diez

Yes, tm_year is converted to a unicode string elsewhere in the program.
Re: Strange thing with types
On May 29, 2:24 pm, alex23 [EMAIL PROTECTED] wrote:

> On May 29, 11:09 pm, TYR [EMAIL PROTECTED] wrote:
>
> > I'm doing some data normalisation, which involves data from a Web
> > site being extracted with BeautifulSoup, cleaned up with a regex,
> > then having the current year as returned by time()'s tm_year
> > attribute inserted, before the data is concatenated with
> > string.join() and fed to time.strptime().
> >
> > Here's some code:
> >
> >     timeinput = re.split('[\s:-]', rawtime)
> >     print timeinput #trace statement
> >     print year #trace statement
> >     t = timeinput.insert(2, year)
> >     print t #trace statement
> >     t1 = string.join(t, '')
> >     timeobject = time.strptime(t1, '%d %b %Y %H %M')
> >
> > year is a Unicode string; so is the data in rawtime (BeautifulSoup
> > gives you Unicode, dammit). And here's the output:
> >
> >     [u'29', u'May', u'01', u'00']   (OK, so the regex is working)
> >     2008   (OK, so the year is a year)
> >     None   (...but what's this?)
> >     Traceback (most recent call last):
> >       File "bothv2.py", line 71, in <module>
> >         t1 = string.join(t, '')
> >       File "/usr/lib/python2.5/string.py", line 316, in join
> >         return sep.join(words)
> >     TypeError
>
> list.insert modifies the list in-place:
>
>     >>> l = [1,2,3]
>     >>> l.insert(2,4)
>     >>> l
>     [1, 2, 4, 3]
>
> It also returns None, which is what you're assigning to 't' and then
> trying to join. Replace your usage of 't' with 'timeinput' and it
> should work.

Thank you.