I understand that the web is full of ill-formed XHTML pages, but this is Microsoft:
http://moneycentral.msn.com/companyreport?Symbol=BBBY

I can't validate it, and xml.dom.minidom.parseString won't work on it. If this were just some teenager's web site I'd move on. Is there any hope of avoiding regular expression hacks to extract the data from this page?

Chris
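
For illustration only, here is a minimal sketch of the situation described above. It assumes Python 3 and uses a made-up snippet of ill-formed markup (SAMPLE), not the actual MSN page, whose structure is not shown here. The helper names minidom_accepts, CellCollector and extract_cells are hypothetical. The sketch shows that xml.dom.minidom rejects markup with unclosed tags, while the standard library's html.parser.HTMLParser is tolerant enough to walk the same markup without regular expressions. Whether this would be enough for the real page is an open question.

```python
from html.parser import HTMLParser
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical ill-formed markup (unclosed <td> and <br>); NOT the real page.
SAMPLE = (
    "<html><body><table>"
    "<tr><td>Last Price<td>42.10</tr>"
    "</table><br></body></html>"
)


def minidom_accepts(markup):
    """Return True if xml.dom.minidom can parse the markup, else False."""
    try:
        minidom.parseString(markup)
        return True
    except ExpatError:
        # Raised by the underlying expat parser for markup that is not well-formed.
        return False


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, tolerating unclosed tags."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        # A new <td> starts a new cell even if the previous one was never closed.
        if tag == "td":
            self.cells.append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        # Closing the cell, row, or table ends the current cell.
        if tag in ("td", "tr", "table"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def extract_cells(markup):
    """Return the stripped text of each table cell found in the markup."""
    parser = CellCollector()
    parser.feed(markup)
    parser.close()
    return [cell.strip() for cell in parser.cells]


if __name__ == "__main__":
    print(minidom_accepts(SAMPLE))
    print(extract_cells(SAMPLE))
```

To try it against a real page, the downloaded HTML text would be passed to extract_cells in place of SAMPLE; which tags to collect depends on how that page is laid out.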