Re: difference between urllib2.urlopen and firefox view 'page source'?

Tina I Mon, 19 Mar 2007 23:01:11 -0800

cjl wrote:
> Hi.
> 
> I am trying to screen scrape some stock data from yahoo, so I am
> trying to use urllib2 to retrieve the html and beautiful soup for the
> parsing.
> 
> Maybe (most likely) I am doing something wrong, but when I use
> urllib2.urlopen to fetch a page, and when I view 'page source' of the
> exact same URL in firefox, I am seeing slight differences in the raw
> html.
> 
> Do I need to set a browser agent so yahoo thinks urllib2 is firefox?
> Is yahoo detecting that urllib2 doesn't process javascript, and
> passing different data?
> 
> -cjl
> 
Unless the data you need depends on the site detecting a specific
browser, you will probably receive 'cleaner' code that is more easily
parsed if you don't set a user agent. Usually the browser-specific
optimization they do is just eye candy, bells and whistles, meant to
give you a more 'pleasing experience'. I doubt that your program will
care about that ;)
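
If you want to check for yourself whether the user agent changes what
comes back, something like the rough sketch below might help. It only
shows how a request with and without a User-Agent header could be built.
The URL and the agent string are made-up placeholders, not anything
Yahoo-specific, and build_request and fetch are hypothetical helper
names. The import falls back to urllib.request in case you are on
Python 3, where urllib2 no longer exists.

```python
try:
    import urllib2 as request_lib  # Python 2, as used in this thread
except ImportError:
    import urllib.request as request_lib  # Python 3 equivalent

# Placeholder values for illustration only (assumptions, not from the thread)
URL = "http://example.com/"
BROWSER_AGENT = "Mozilla/5.0 (X11; Linux) Gecko Firefox"


def build_request(url, user_agent=None):
    """Return a Request; add a User-Agent header only when one is given."""
    headers = {}
    if user_agent is not None:
        headers["User-Agent"] = user_agent
    return request_lib.Request(url, headers=headers)


def fetch(url, user_agent=None):
    """Download the page body as raw bytes (needs network access)."""
    response = request_lib.urlopen(build_request(url, user_agent))
    try:
        return response.read()
    finally:
        response.close()


# Building the requests does not touch the network.
plain = build_request(URL)
browser_like = build_request(URL, BROWSER_AGENT)
```

Calling fetch(URL) and fetch(URL, BROWSER_AGENT) and comparing the two
results should show whether the site serves different html per agent.
Without an explicit header the library sends its own default
"Python-urllib/x.y" agent string, so the server can still tell it is
not talking to a browser.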


Tina
-- 
http://mail.python.org/mailman/listinfo/python-list
