On 3/25/10 8:41 AM, Dr. Benjamin David Clarke wrote:
Does anyone know of a way to save a loaded web page to a file after
opening it with a webbrowser.open() call?

Specifically, what I want to do is get the raw HTML from a web page.
This web page uses Javascript. I need the resulting HTML after the
Javascript has been run. I've seen a lot about trying to get Python to
run Javascript but there doesn't seem to be any promising solution. I
can get the raw HTML that I want by saving the page after it has been
loaded via the webbrowser.open() call. Is there any way to automate
this? Does anyone have any ideas for better approaches to this
problem? I don't need it to be pretty or anything.

I think I would use a GUI automation library to simulate user interaction with the web browser that you just started, e.g. to select the File > Save Page As > HTML only menu option in the browser.

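If the JavaScript heavily modifies the DOM, that might not work, however, because the saved file may contain the original page source rather than the DOM as the scripts left it. You might need additional tooling such as the Web Developer toolbar for Firefox, where you can then use View Source > View Generated Source.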
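
As a different route from the GUI automation above (a swapped-in technique, not something from the original question): Chromium-based browsers have a headless mode whose --dump-dom flag prints the DOM after scripts have run. Below is a minimal sketch, assuming such a browser is installed and on PATH. The names CANDIDATE_BROWSERS, build_dump_command, find_browser and save_rendered_html are made up for illustration, and the executable names are guesses that will differ per system.

```python
import shutil
import subprocess

# Assumption: hypothetical list of executable names; adjust for your system.
CANDIDATE_BROWSERS = ("chromium", "chromium-browser", "google-chrome", "chrome")


def build_dump_command(browser, url):
    """Return the argument list asking a headless browser to print the DOM."""
    return [browser, "--headless", "--dump-dom", url]


def find_browser(candidates=CANDIDATE_BROWSERS):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def save_rendered_html(url, out_path, timeout=60):
    """Write the HTML of url, as serialized after JavaScript ran, to out_path."""
    browser = find_browser()
    if browser is None:
        raise RuntimeError("no Chromium-based browser found on PATH")
    # The browser writes the serialized DOM to standard output.
    result = subprocess.run(
        build_dump_command(browser, url),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(result.stdout)
    return out_path
```

Calling save_rendered_html("http://example.com/", "page.html") would then write the generated source to page.html. Pages that keep loading content after the initial scripts finish may still come out incomplete, so treat this as a starting point only.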

irmen

--
http://mail.python.org/mailman/listinfo/python-list
