I have a Java app that works as an HTTP proxy, and I'm trying to
integrate it with Gecko through JavaXPCom so that I can use Gecko's
DOM rendering for server-side DOM creation and serialization, for
scraping purposes.

The problem I'm trying to solve is server-side DOM creation. For
example, if the DOM of a page is built entirely by JavaScript,
fetching that page through plain Java APIs only gets me the markup as
served, with the JavaScript still in it. I currently have no way of
opening and rendering that page server-side so that the JavaScript can
build the rest of the page's DOM, which I could then serialize back.
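
As a minimal sketch of that limitation, this is roughly what a plain
Java fetch looks like. The class and method names are made up for
illustration, and UTF-8 is assumed as the response charset:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating a plain-Java page fetch.
class RawFetch {

    // Reads the whole stream and decodes it as UTF-8 (an assumption;
    // a real proxy would honor the charset from Content-Type).
    static String readBody(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    // Returns the markup exactly as served. Any <script> blocks come
    // back as text; nothing executes them, so no DOM is built.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream()) {
            return readBody(in);
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling `RawFetch.fetch(...)` on a JavaScript-built page returns the
script source, not the elements the script would have created.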

The end result would be that the HTTP proxy returns a completely
generated DOM after all JavaScript in the page has run.

I've looked at Crowbar, which tells me that this sort of
functionality is possible with XPCOM, but I need to do it in Java and
not JS/XUL.  One limitation of Crowbar for my purposes is that it
currently doesn't handle content types or non-HTML/XHTML responses
(such as images) from the target sites it's proxying to.

Basically, this is what I have right now as part of my Java proxy:
1. The URL I need to access on the target site
2. All URL params that I need to send through (proxy through to
the target site)
3. All HTTP headers that I need to send through
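
The three inputs above could be bundled roughly like this. It is only
a sketch: `ProxyRequest` and its members are hypothetical names, and
the params are form-encoded with `URLEncoder`, which may not match
what every target site expects:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for what the proxy knows before forwarding.
class ProxyRequest {
    final String targetUrl;
    final Map<String, String> params = new LinkedHashMap<String, String>();
    final Map<String, String> headers = new LinkedHashMap<String, String>();

    ProxyRequest(String targetUrl) {
        this.targetUrl = targetUrl;
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Appends the params to the target URL as an encoded query string,
    // preserving insertion order.
    String fullUrl() {
        StringBuilder sb = new StringBuilder(targetUrl);
        char sep = targetUrl.indexOf('?') >= 0 ? '&' : '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            sb.append(sep)
              .append(enc(p.getKey()))
              .append('=')
              .append(enc(p.getValue()));
            sep = '&';
        }
        return sb.toString();
    }
}
```

The `headers` map is carried along unchanged so that whatever makes
the real request can copy each entry onto it.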

At a minimum, I'd like to hand the HTML markup to a JavaXPCom
component, have it render the page and build a complete DOM (running
the JavaScript along the way), and get that DOM back as a String.

Ultimately, I'd like to use JavaXPCom to get the Gecko engine to do
all the transport-layer work (make the HTTP request, handle the
response) and give me back the complete DOM along with all the
response headers.
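
To make the desired contract concrete, here is a sketch of the shape
of the API being asked for. Every name in it is hypothetical; none of
this is part of JavaXPCom or Gecko, and the Gecko-backed
implementation is exactly the missing piece:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical result: the DOM after scripts have run, serialized to
// a String, plus the status and headers of the upstream response.
class RenderedPage {
    final int status;
    final Map<String, List<String>> responseHeaders;
    final String serializedDom;

    RenderedPage(int status,
                 Map<String, List<String>> responseHeaders,
                 String serializedDom) {
        this.status = status;
        // Defensive copy so later changes by the caller do not leak in.
        this.responseHeaders = Collections.unmodifiableMap(
                new LinkedHashMap<String, List<String>>(responseHeaders));
        this.serializedDom = serializedDom;
    }
}

// Hypothetical contract a Gecko-backed component would implement:
// make the request, run the page's JavaScript, return the final DOM.
interface PageRenderer {
    RenderedPage render(String url, Map<String, String> requestHeaders);
}
```

With an interface like this, the proxy would stay unaware of how the
rendering is done, and the minimal variant (render markup that was
already fetched) could be a second method on the same type.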

Note that my app runs as a server-side headless component.

Thanks for any help.