I am trying to parse a webpage and extract information from it using pyparsing. Here is what I have:

from pyparsing import *
import urllib

# define basic text pattern
spanStart = Literal('<span class=\"hpPageText\">')
spanEnd = Literal('</span></td>')
printCount = spanStart + SkipTo(spanEnd) + spanEnd

# get printer addresses
printerURL = "http://printer.mydomain.com/hp/device/this.LCDispatcher?nav=hp.Usage"
printerListPage = urllib.urlopen(printerURL)
printerListHTML = printerListPage.read()
printerListPage.close()  # call close(); a bare .close does nothing

# print the tokens for every match found in the page
for srvrtokens, startloc, endloc in printCount.scanString(printerListHTML):
    print srvrtokens

# show the pattern that is actually being matched
print printCount

I added the last print statement to check what pattern is being used, because I am getting nothing back. It prints:

{"<span class="hpPageText">" SkipTo:("</span></td>") "</span></td>"}

If I pull the "hpPageText" part out of the pattern, I get results back, but more than I want. I know it has something to do with escaping the quotation marks, but I am puzzled as to how to do it.

Thanks,
Mike
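
P.S. In case it helps anyone reproduce this without access to the printer, here is a minimal stdlib-only sketch. The sample_html string is made up (I am assuming the page contains markup of roughly this shape), and it uses plain str.find instead of pyparsing. It only checks whether the backslash-escaped literal differs from the unescaped one:

```python
# Sketch only: sample_html is a made-up stand-in, not the real printer page.
escaped = '<span class=\"hpPageText\">'  # literal as written in the script
plain = '<span class="hpPageText">'      # same text without the backslashes

# Inside a single-quoted string, \" is just a double-quote character.
same_literal = (escaped == plain)

sample_html = '<td><span class="hpPageText">12345</span></td>'
end_tag = '</span></td>'

# Locate the text between the opening span and the closing tags.
start = sample_html.find(escaped)
end = sample_html.find(end_tag)
if start != -1 and end != -1:
    count_text = sample_html[start + len(escaped):end]
else:
    count_text = None

print(same_literal)
print(count_text)
```

If same_literal comes out True, the backslashes are not changing the pattern, so the mismatch may lie in how the real page writes the tag (spacing, quoting, or attribute order) rather than in the escaping itself.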