I've been looking for a way to scrape the data from an HTML table, but I don't even know where to start or how to do it.
An example of the table can be found here ( http://www.dragon256.plus.com/timer.html ). I'd like to extract all the data except for the delete column and then just print each row.
Use the urllib2 module to obtain the page source (this is Python 2; in Python 3 the same urlopen function lives in urllib.request):
import urllib2
page = urllib2.urlopen("http://www.dragon256.plus.com/timer.html")
html = page.readlines()
You now have a list of lines.
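For readers on Python 3, where urllib2 no longer exists, the same step might look roughly like the sketch below. The helper name fetch_lines and the UTF-8 default are assumptions made for illustration; the page's real encoding is not stated in this thread.

```python
from urllib.request import urlopen


def fetch_lines(url, encoding="utf-8"):
    """Return the document at `url` as a list of text lines.

    `encoding` is an assumption; undecodable bytes are replaced
    rather than raising an error.
    """
    with urlopen(url) as page:
        raw = page.read()  # bytes in Python 3
    return raw.decode(encoding, errors="replace").splitlines()


# Hypothetical usage (needs network access):
# html = fetch_lines("http://www.dragon256.plus.com/timer.html")
```

Unlike readlines(), this returns lines without their trailing newline characters.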
Now you can use any number of string parsing tools to locate lines starting with <tr> to find each new row, then <td> to find each cell, then search past the tag(s) to find the cell text.
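As one possible illustration of that string-parsing idea, here is a rough sketch using regular expressions. It assumes every row is closed with </tr> and every cell with </td>, which may not hold for the real page, and the name extract_rows is made up for this example. If you start from the list of lines above, join it into one string first with "".join(html).

```python
import re

# Non-greedy matches so each pattern stops at the first closing tag.
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
CELL_RE = re.compile(r"<td\b[^>]*>(.*?)</td>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")  # any remaining tag inside a cell


def extract_rows(html_text):
    """Return a list of rows; each row is a list of cell strings, tags removed."""
    rows = []
    for row_html in ROW_RE.findall(html_text):
        cells = [TAG_RE.sub("", cell).strip() for cell in CELL_RE.findall(row_html)]
        if cells:  # skip rows that contain no <td> cells
            rows.append(cells)
    return rows


sample = (
    "<tr><td class='normal' align='left'><a href=''>Glastonbury 2005</a></td>"
    "<td class='normal' align='left'>BBC THREE</td>"
    "<td class='normal' align='middle'><input type='checkbox' ></td></tr>"
)
for row in extract_rows(sample):
    print(row)
```

Note that the checkbox cell comes back as an empty string here, because this version only strips tags and does not yet drop the delete column.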
You have 3 cases to deal with:
<td class='normal' align='left'><a href=''>Glastonbury 2005</a></td>
<td class='normal' align='left'>BBC THREE</td>
<td class='normal' align='middle'><input type='checkbox' ></td>
Is that enough to get you started?
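If it helps, the three cases above could also be handled with the standard library's html.parser module instead of hand-written string searches. This is only a sketch for Python 3: the class name TableRows is made up for this example, and it assumes the delete column is exactly the cell that contains an <input> tag.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect cell text row by row, dropping cells that hold an <input>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row being read
        self._text = None   # text pieces of the cell being read
        self._skip = False  # True once the current cell contains <input>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag == "td":
            self._close_cell()  # tolerate a missing </td>
            if self._row is None:
                self._row = []
            self._text, self._skip = [], False
        elif tag == "input" and self._text is not None:
            self._skip = True  # case 3: the checkbox cell

    def handle_endtag(self, tag):
        if tag == "td":
            self._close_cell()
        elif tag == "tr":
            self._close_row()

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)  # cases 1 and 2: link text or plain text

    def _close_cell(self):
        if self._text is not None:
            if not self._skip:
                self._row.append("".join(self._text).strip())
            self._text = None

    def _close_row(self):
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def close(self):
        super().close()
        self._close_row()  # flush a final row that was never closed


sample = (
    "<tr><td class='normal' align='left'><a href=''>Glastonbury 2005</a></td>"
    "<td class='normal' align='left'>BBC THREE</td>"
    "<td class='normal' align='middle'><input type='checkbox' ></td></tr>"
)
parser = TableRows()
parser.feed(sample)
parser.close()
for row in parser.rows:
    print(", ".join(row))
```

The <a> tag needs no special handling because only the text between tags is collected, so the link case and the plain-text case end up on the same code path.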