Piet van Oostrum wrote:
<snip>
DA> But the raw page didn't have any javascript. So what about that original
DA> raw page triggered additional stuff to be loaded?
DA> Is it "user agent", as someone else brought out? And is there somewhere I
DA> can read more about that aspect of things? I've mostly built very static
DA> html pages, where the server yields the same page to everybody. And some
DA> form stuff, where the user clicks on a 'submit' button to trigger a script
DA> that's not shown on the URL line.
Yes, if you specify a 'normal' web browser as user agent you do get the
Javascript:
import urllib2

# Build the request, then identify as a regular browser via User-Agent.
request = urllib2.Request(
    'http://www.marketwatch.com/story/'
    'mondays-biggest-gaining-and-declining-stocks-2009-07-27')
request.add_header(
    'User-Agent',
    'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.13) '
    'Gecko/2009073021 Firefox/3.0.13')
opener = urllib2.build_opener()
page = opener.open(request).read()
print page
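
For anyone on Python 3, where urllib2 became urllib.request, here is a rough sketch of the same idea. It also shows what a plain opener sends as User-Agent when you don't set one, which is presumably what the server was reacting to. The helper names (default_user_agent, build_request, fetch) and the BROWSER_UA string are made up for illustration and are not from this thread. Only fetch() touches the network, and it is not called here.

```python
# Python 3 sketch; urllib2 from the post above became urllib.request.
import urllib.request

# Hypothetical browser-like string, used here only as a placeholder.
BROWSER_UA = ("Mozilla/5.0 (X11; Linux x86_64; rv:109.0) "
              "Gecko/20100101 Firefox/115.0")


def default_user_agent():
    """Return the User-Agent a plain opener sends when none is set."""
    opener = urllib.request.build_opener()
    return dict(opener.addheaders)["User-agent"]


def build_request(url, user_agent=BROWSER_UA):
    """Build a request that identifies itself with the given User-Agent."""
    request = urllib.request.Request(url)
    request.add_header("User-Agent", user_agent)
    return request


def fetch(url, user_agent=BROWSER_UA):
    """Fetch the page body as bytes (needs network access)."""
    with urllib.request.urlopen(build_request(url, user_agent)) as response:
        return response.read()


if __name__ == "__main__":
    # Typically prints something starting with "Python-urllib/".
    print(default_user_agent())
    req = build_request("http://www.example.com/")
    # urllib stores header names capitalized, hence "User-agent".
    print(req.get_header("User-agent"))
```

Calling fetch(url) once with the default string from default_user_agent() and once with a browser-like string, then comparing the two bodies, should show whether a given server varies its output by User-Agent.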
Thanks much. That's a key I didn't understand.
DaveA
--
http://mail.python.org/mailman/listinfo/python-list