If I have a string such as '<td>01/12/2011</td>' and I want
to reformat it as '20110112', how do I pull out the components
of the string and reformat them into a YYYYDDMM format?
I have:
import re
test = re.compile(r'\d\d/')  # two digits followed by a slash
f = open('test.html')  # This file contains the HTML dates
for line in f:
    if test.search(line):
        # I need to pull the date components here
        pass
I am no Python guru, but you could use BeautifulSoup to parse the HTML, as
it's much easier.
Some untested pseudocode is below; adapt it to your needs.
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 import
# Read the HTML data from a file or whatever source
html_data = open('/yourwebsite/page.html', 'r').read()
# Create the soup object from the HTML data
soup = BeautifulSoup(html_data)
# Find the proper tag; see the BeautifulSoup docs
someData = soup.find('td', attrs={'name': 'someTable'})
# The value of the 3rd attribute of the tag, just an example
value = someData.attrs[2][1]
## end
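If installing BeautifulSoup is not an option, the standard library can pull
the text out of the <td> cells as well. This is only a rough sketch, assuming
Python 3 and its html.parser module; the names CellTextParser and td_texts
are made up for illustration.

```python
from html.parser import HTMLParser


class CellTextParser(HTMLParser):
    """Collect the text found inside every <td> element."""

    def __init__(self):
        super().__init__()
        self.in_td = False   # True while we are between <td> and </td>
        self.cells = []      # one string per <td> seen so far

    def handle_starttag(self, tag, attrs):
        if tag == 'td':
            self.in_td = True
            self.cells.append('')

    def handle_endtag(self, tag):
        if tag == 'td':
            self.in_td = False

    def handle_data(self, data):
        # Only keep text that appears inside a table cell
        if self.in_td:
            self.cells[-1] += data


def td_texts(html):
    """Return the stripped text of each <td> in the given HTML string."""
    parser = CellTextParser()
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]


print(td_texts('<tr><td>01/12/2011</td><td>other</td></tr>'))
```

It does not handle nested tables; for anything messy BeautifulSoup is still
the easier route.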
Now that you have the date in some str format, the next thing is your date
conversion. For this,
refer to the dateutil parser: http://labix.org/python-dateutil
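For a fixed format like the one in the question, the standard library alone
may be enough. Below is a minimal sketch, assuming the input dates are
DD/MM/YYYY (so that '01/12/2011' becomes '20110112' in YYYYDDMM order); the
name reformat_dates is just an illustration, not something from the original
code.

```python
import re
from datetime import datetime

# Matches dates such as 01/12/2011 anywhere in a line of HTML
DATE_RE = re.compile(r'(\d{2})/(\d{2})/(\d{4})')


def reformat_dates(line):
    """Return every date found in line, rewritten as YYYYDDMM."""
    results = []
    for match in DATE_RE.finditer(line):
        try:
            # Assumption: input is DD/MM/YYYY. strptime also rejects
            # impossible dates such as 45/13/2011.
            parsed = datetime.strptime(match.group(0), '%d/%m/%Y')
        except ValueError:
            continue
        results.append(parsed.strftime('%Y%d%m'))
    return results


print(reformat_dates('<td>01/12/2011</td>'))  # ['20110112']
```

If the dates are really MM/DD/YYYY, swap %d and %m in both format strings.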
Hope it helps.
----------------------------
posted via Grepler.com -- poster is authenticated.
--
http://mail.python.org/mailman/listinfo/python-list