BZ> Note that if you do what the view-source window does in Mozilla to load
BZ> Mozilla's built-in view-source into a display:none docshell or something
BZ> and then grab the text out of that, it'll be about what you want (though
BZ> characters, not bytes, and not as fast as just getting the bytes).  This
BZ> preserves whitespace, all sorts of malformed stuff, etc.

EB> Thanks, I'll try that approach, but I need more specifics on "grab
EB> the text out of that".  I have an nsIDocShell that has been loaded
EB> in the same manner as the view-source window.  Now, how do I get the
EB> text out of it?

EB> I've done a bit of research.

EB> The trail starts at
EB> <http://lxr.mozilla.org/mozilla/source/browser/base/content/browser.js#1820>.

[...]

EB> I think this adequately describes how ViewSource differs from a
EB> regular page load, but I still don't see how to get the text.
EB> Boris?  Help?

Ah, I think I see what you mean.  You mean I should use my existing
technique for getting the DOM text, but apply it to a view-source: view
of the page.  This gave me the idea to take the following very simple
approach, which I can implement mostly in Java.

1. Make the necessary changes so I can reload the current page from the
cache into a new, non-displaying BrowserControl with "view-source:"
prepended to the URL.

2. Call CurrentPage.getSelection().toString() to get the source!

I tried this general approach in the debugger (NetBeans) and it worked.
Now to code it up!
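
For anyone following along, here is a rough sketch of the shape I have in
mind.  It is not the real webclient API: HiddenBrowser, load(), and
selectedText() are hypothetical stand-ins for the non-displaying
BrowserControl and the CurrentPage.getSelection().toString() call, and I am
assuming the page URL is available as a plain String.  The only part that is
concrete is prepending "view-source:" to the URL.

```java
class ViewSourceSketch {
    static final String PREFIX = "view-source:";

    // Prepend "view-source:" unless the URL already carries it.
    static String toViewSourceUrl(String url) {
        if (url == null || url.isEmpty()) {
            throw new IllegalArgumentException("url must be non-empty");
        }
        return url.startsWith(PREFIX) ? url : PREFIX + url;
    }

    // Hypothetical stand-in for a non-displaying browser control.
    // These method names are assumptions, not webclient API.
    interface HiddenBrowser {
        void load(String url);   // step 1: load (ideally from cache)
        String selectedText();   // step 2: text of the whole-page selection
    }

    // Steps 1 and 2 combined: load the view-source: URL, then read the text.
    static String fetchSource(HiddenBrowser browser, String pageUrl) {
        browser.load(toViewSourceUrl(pageUrl));
        return browser.selectedText();
    }

    public static void main(String[] args) {
        // A fake browser that only records the URL it was asked to load.
        HiddenBrowser fake = new HiddenBrowser() {
            private String loaded;
            public void load(String url) { loaded = url; }
            public String selectedText() { return "source of " + loaded; }
        };
        System.out.println(fetchSource(fake, "http://example.com/"));
    }
}
```

Keeping the browser behind a small interface like this would let me test the
URL handling without starting Mozilla at all; the real implementation would
wrap the hidden BrowserControl.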

Thanks for your ideas,

Ed
