On 4 Feb 2005 15:33:50 -0800, Mudcat <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm wondering the best way to do the following.
>
> I would like to use a map webpage (like yahoo maps) to find the
> distance between two places that are pulled in from a text file. I want
> to accomplish this without displaying the browser.
That's called "web scraping", in case you want to Google for info.

> I am looking at several options right now, including urllib, httplib,
> packet trace, etc. But I don't know where to start with it or if there
> are existing tools that I could incorporate.
>
> Can someone explain how to do this or point me in the right direction?

I did it this way successfully once ... it's probably the wrong approach in
some ways, but It Works For Me.

- used httplib.HTTPConnection for the HTTP parts, building my own requests
  with headers and all, calling h.send() and h.getresponse() etc.

- created my own cookie container class (because there was a session
  involved, with logging in and such things, and all of it used cookies)

- subclassed sgmllib.SGMLParser once for each kind of page I expected to
  receive. Each class knew how to pull the information from an HTML
  document, provided it looked as I expected it to. Very tedious work.
  It can be easier and safer to just use module re in some cases.

Wrapped in classes this ended up as (fictive):

    client = Client('somehost:80')
    client.login('me', 'secret')
    a, b = theAsAndBs(client, 'tomorrow', 'Wiltshire')
    foo = theFoo(client, 'yesterday')

I had to look deeply into the HTTP RFCs to do this, and also snoop the
traffic for a "real" session to see what went on between server and client.

/Jorgen

-- 
  // Jorgen Grahn <jgrahn@       Ph'nglui mglw'nafh Cthulhu
\X/                algonet.se>   R'lyeh wgah'nagl fhtagn!
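
For anyone reading this on Python 3: the "one parser class per kind of page" idea, and the "just use re" shortcut, might look roughly like the sketch below. It is not the code from the session described above. sgmllib no longer exists in Python 3, so html.parser.HTMLParser is swapped in, and the sample markup, DistancePageParser, distance_via_parser and distance_via_regex are all made up for illustration. A real map site will have different markup.

```python
# Python 3 sketch; html.parser.HTMLParser stands in for the old sgmllib.SGMLParser.
import re
from html.parser import HTMLParser

# Hypothetical page fragment (assumption): real map pages will look different.
SAMPLE_PAGE = """
<html><body>
<h1>Driving directions</h1>
<span class="distance">Total distance: 12.4 miles</span>
</body></html>
"""


class DistancePageParser(HTMLParser):
    """Knows one kind of page: collects the text of <span class="distance">."""

    def __init__(self):
        super().__init__()
        self._in_distance = False
        self.distance_text = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "span" and ("class", "distance") in attrs:
            self._in_distance = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_distance = False

    def handle_data(self, data):
        # Keep only the first piece of text seen inside the span
        if self._in_distance and self.distance_text is None:
            self.distance_text = data.strip()


def distance_via_parser(html):
    """Return the raw text of the distance span, or None if it is absent."""
    parser = DistancePageParser()
    parser.feed(html)
    parser.close()
    return parser.distance_text


def distance_via_regex(html):
    """The 'module re' shortcut: return the number of miles, or None."""
    match = re.search(r"Total distance:\s*([\d.]+)\s*miles", html)
    return float(match.group(1)) if match else None
```

Both helpers return None when the page does not look as expected, which is the case to check for first when the site changes its layout.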