Hi. I'm looking to parse some web pages that use JavaScript (jQuery). I'm trying to get a better understanding of exactly how such a page is generated and displayed in the browser.
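
To make the problem concrete, here's a rough sketch (standard library only) of what I think a plain fetch sees on this kind of page: the 'div' is there, but empty, because nothing has run the script that fills it. The sample markup, the "dept" id, and "depts.html" are made up for illustration and not taken from any real site.

```python
from html.parser import HTMLParser

# Hypothetical sample of what a plain HTTP fetch might return: the div is
# present but empty, because the script that fills it has not been run.
STATIC_HTML = """
<html><body>
<h1>Schedule</h1>
<div id="dept"></div>
<script>$(function () { $("#dept").load("depts.html"); });</script>
</body></html>
"""


class EmptyDivFinder(HTMLParser):
    """Collect the ids of <div> elements that contain no text or child tags."""

    def __init__(self):
        super().__init__()
        self.empty_ids = []
        self._stack = []  # one [id, has_content] entry per open <div>

    def handle_starttag(self, tag, attrs):
        if self._stack:
            self._stack[-1][1] = True  # a child tag counts as content
        if tag == "div":
            self._stack.append([dict(attrs).get("id"), False])

    def handle_data(self, data):
        if self._stack and data.strip():
            self._stack[-1][1] = True  # non-whitespace text counts as content

    def handle_endtag(self, tag):
        if tag == "div" and self._stack:
            div_id, has_content = self._stack.pop()
            if not has_content and div_id:
                self.empty_ids.append(div_id)


def empty_div_ids(html):
    """Return the ids of empty divs, i.e. likely script-filled placeholders."""
    parser = EmptyDivFinder()
    parser.feed(html)
    parser.close()
    return parser.empty_ids


if __name__ == "__main__":
    print(empty_div_ids(STATIC_HTML))  # -> ['dept']
```

So an ordinary HTML parser can tell me which placeholders are empty, but it can't produce the text that the script would have put there. That missing step is what I'm asking about.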
I've seen various references to python-spidermonkey, as well as Watir/FireWatir. Is there a way to fetch the text from JavaScript-generated pages? It appears that "calling" Firefox through "jssh" could let me get back the complete page as displayed. I'm not sure if this is pythonic!

I suspect that I would have to load the page using Firefox/jssh (or spidermonkey, or some other JavaScript engine), and then somehow invoke the JavaScript function that fills in the 'div' within the page. Is this even doable? It would be great if there were some way of calling an external browser/app, passing it the target URL, and getting back the resulting HTML that the browser displays.

A target site is http://web-app.usc.edu/soc/term_20091.html where the 'dept' list is completely generated by JavaScript functions.

When I researched this a while ago, there didn't appear to be a really good solution. I'm curious whether someone knows of a solution that's now available and that works!

Thanks for any thoughts/comments on this issue.

-bruce

-- 
http://mail.python.org/mailman/listinfo/python-list