I recently built Mozilla 1.6 with Perl XPCOM (plxpcom).
Now I want to see if I can use XPCOM to access the DOM
of an HTML document loaded into a running instance of Mozilla.
I can see that you can access XPCOM components from JavaScript
within a browser; these components could either be ones that
come with Mozilla, such as the Cookie Manager, or ones that
are created and installed into Mozilla, for example using plxpcom.

My ultimate goal, however, is to be able to "drive" a browser
from a separate script. I would like to be able to load a page,
fill in a form (including selecting items in a <select> list,
checking checkboxes, etc.), and submit it. It's possible to do
something like this in Perl using a module called WWW::Mechanize.
However, this module doesn't execute JavaScript. There is a similar
module, Win32::IE::Mechanize, which uses OLE automation of
Internet Explorer and does handle JavaScript; however,
since I primarily use Linux, on which IE doesn't run [1], I would
like to be able to "mechanize" Mozilla instead.

This is where things get fuzzy. Can a script access the "live"
components in a separately running Mozilla? I tried creating a
simple plxpcom script [2] to access the Cookie Manager, but it didn't
seem to return any cookies. Assuming this isn't a plxpcom bug, I'm
guessing this is because my script doesn't live within a running
Mozilla, so this standalone "application" has no saved cookies of
its own. I probably need to use something like gtkmozembed to embed
Gecko into an application.

If that's true, there is another Perl module, Gtk2::MozEmbed,
which is a wrapper around gtkmozembed. It has an example script
showing how to create a simple browser. However, this module doesn't
provide a way to access the DOM: it doesn't wrap
gtk_moz_embed_get_nsIWebBrowser. So I wonder if I can use Perl
XPCOM within a Gtk2::MozEmbed application in order to get to the DOM,
but I don't see how to connect the two together. Does anyone know
of a tutorial or explanation of this (even in C++)? In particular,
an explained example of a minimal gtkmozembed browser that accesses
the DOM would be great.

So far, I've noticed that plxpcom does this when it's loaded:

    res = NS_InitXPCOM2(nsnull, xpDir, nsnull);
    // If we failed, chances are good that init was already called.
    if (NS_SUCCEEDED(res)) {
        atexit(XPCOMShutdown);
    }

This seems to imply that if XPCOM has already been initialized,
plxpcom will use the existing XPCOM "instance". Is that all that's
needed for it to be "connected" to the browser it's running in?
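
To check that I'm reading that logic correctly, here is a small
self-contained C++ mock of the pattern. It is only a sketch: MockInit,
LoadBinding, RunDemo and the counters are hypothetical stand-ins I made
up for illustration, not XPCOM or plxpcom symbols, and it assumes that
initialization fails when it has already happened in the same process
(which is what the comment in the plxpcom source suggests).

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; none of these names
// are real XPCOM or plxpcom symbols.
typedef int Result;
const Result kOk = 0;
const Result kAlreadyInitialized = 1;

static bool g_initialized = false;  // process-wide "XPCOM is up" flag
static int g_shutdown_hooks = 0;    // how many callers took ownership of shutdown

// Mock initializer: succeeds once per process, fails on every later call.
Result MockInit() {
    if (g_initialized) {
        return kAlreadyInitialized;
    }
    g_initialized = true;
    return kOk;
}

// Mirrors the quoted logic: only the caller whose init succeeded
// registers shutdown. Returns true if this caller registered it.
bool LoadBinding() {
    Result res = MockInit();
    if (res == kOk) {
        ++g_shutdown_hooks;  // the quoted code calls atexit(XPCOMShutdown) here
        return true;
    }
    return false;  // init already happened; keep using the existing state
}

// Simulates a host application initializing first and a binding being
// loaded into the same process afterwards. Returns the hook count.
int RunDemo() {
    bool host_owns_shutdown = LoadBinding();
    bool binding_owns_shutdown = LoadBinding();
    assert(host_owns_shutdown);
    assert(!binding_owns_shutdown);
    return g_shutdown_hooks;
}
```

In this mock, RunDemo() returns 1: the second caller reuses the state
the first one set up and does not register a second shutdown. The flag
is an ordinary static variable, so it only models two callers inside
one process; it says nothing about a script started as a separate
process.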

(Hmm, I just realized I'll have to rebuild Gtk2::MozEmbed
for my Mozilla 1.6 build. But I think Gtk2::MozEmbed requires
gtkmozembed >= 1.7. Ugh...)
____
[1] Technically, you can run IE on Linux using WINE. I also installed
ActiveState Perl under WINE, so it's possible I could use
Win32::IE::Mechanize on Linux. That solution is kind of ugly at
best, though.
[2]
#!/usr/bin/perl
# This script must be invoked with run-mozilla.sh so that the
# correct environment variables are set:
# ../../src/mozilla-source-1.6/dist/bin/run-mozilla.sh ./get-cookies.pl
use XPCOM;

# The following is equivalent to this JavaScript:
# var cookieManager = Components.classes["@mozilla.org/cookiemanager;1"]
#     .getService(Components.interfaces.nsICookieManager);
# var iter = cookieManager.enumerator;
# while (iter.hasMoreElements()) {
#     var cookie = iter.getNext();
#     ....
# }
my $contract_id = '@mozilla.org/cookiemanager;1';
my $interface_name = 'nsICookieManager';
my $interface = $Components::interfaces{$interface_name};

if (defined $interface) {
    my $cookie_manager = Components::GetService($contract_id, $interface);
    if (NS_SUCCEEDED) {
        my $enum = $cookie_manager->enumerator;
        if (NS_SUCCEEDED) {
            while ($enum->hasMoreElements()) {
                my $cookie = $enum->getNext();
                if (NS_SUCCEEDED) {
                    # One host per line
                    print "host: ", $cookie->host, "\n";
                } else {
                    die "failed to get cookie\n";
                }
            }
        } else {
            die "failed to get cookie manager enumerator\n";
        }
    } else {
        if ($Components::returnCode ==
            Components::results::NS_ERROR_NO_INTERFACE) {
            die "'$contract_id' has no '$interface_name' interface\n";
        } else {
            # XXX: is there a way to get a string representation of
            # the returnCode ?
            die "GetService failed!\n";
        }
    }
} else {
    die "interface '$interface_name' doesn't exist\n";
}
_______________________________________________
Mozilla-xpcom mailing list
[email protected]
http://mail.mozilla.org/listinfo/mozilla-xpcom