-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 24/03/2012 15:53, valdis.kletni...@vt.edu wrote:
> On Sat, 24 Mar 2012 10:26:48 -0000, Dave said:
> 
>> Don't the -e robots=off, --page-requisites and -H wget directives enable
>> one to collect all the necessary files that are called from a page?
> 
> No, not *all* the files, for the same reason that if you visit a page with
> NoScript enabled, you may end up with missing content and/or big open spaces
> on the page.
> 
> Consider a page that has JavaScript on it:
> 
> todaysfile = "http://www.news-site.com/" + date_as_string;
> document.load(todaysfile);
> 
> Unless you interpret the JavaScript, you don't know what URL will get loaded,
> because yesterday and tomorrow will each get a different URL.  So basically,
> if you try to pull it down with wget or similar, you will miss *all* the stuff
> that's pulled down via JavaScript (and probably via CSS as well - does wget
> know how to follow CSS references?).  On many modern web designs,
> this ends up being the vast majority of the content.
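
To make the quoted point concrete, here is a minimal sketch in JavaScript of
what a static fetcher can and cannot see. It is only an illustration, not how
wget works internally: the page contents, the logo.png URL, the function names
staticUrls and runtimeUrl, and the YYYY-MM-DD date format are all assumptions
made up for the example.

```javascript
// Hypothetical page: one static image, plus a script that builds a URL at run time.
const page = `
<html><body>
<img src="http://www.news-site.com/logo.png">
<script>
var todaysfile = "http://www.news-site.com/" + date_as_string;
document.load(todaysfile);
</script>
</body></html>`;

// Naive static extraction: collect literal src/href attribute values.
// This approximates what a crawler sees when it does not execute scripts.
function staticUrls(html) {
  const urls = [];
  const attr = /\b(?:src|href)\s*=\s*"([^"]*)"/g;
  let m;
  while ((m = attr.exec(html)) !== null) {
    urls.push(m[1]);
  }
  return urls;
}

// The URL a script-running browser would request on a given day.
// The date format is an assumption; the original snippet does not specify it.
function runtimeUrl(date) {
  const dateAsString = date.toISOString().slice(0, 10);
  return "http://www.news-site.com/" + dateAsString;
}

const found = staticUrls(page);
const wanted = runtimeUrl(new Date());
console.log("static scan found:", found);
console.log("browser would also fetch:", wanted);
console.log("seen by static scan?", found.includes(wanted));
```

The static scan only ever reports the literal image URL. The dated URL exists
only after the script has run, so it changes every day and never shows up in
the markup itself.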

Thanks Valdis,

Some things are pretty obvious when pointed out.

Dave
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEVAwUBT24RNLIvn8UFHWSmAQLkiwgA2Zkc9GzAeOyeqQAxUGonPf3FoGDOP3ym
QATyh9MRRZVmEP2Bz9B1V7r68XP1aw6NjCDgWgs0di+z/tzd4eRFQfkKvEF+f4Ri
WsO/ywygxps/5UVIl4yo3whpczeza1yLJuOhC5AT3gcxk/Q5Vv/Cm409Wi8uul4S
acgm3wZvv1O5V2VpLUjTt4ucLuH+iKeMQRQOO+qcKHMkL7wtxajrLzKlEd343eaz
aq52jZ1xF1i7V632dvE2Cr2ipNv5sguKHHG26GfBpAjPSLlvtmO7lGQ3PQydUGXY
PDYamLbP2WyTas2Yf1jYoVdo11d3HSu8E39xiQOj02eM84lUesCoxQ==
=M8iL
-----END PGP SIGNATURE-----

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/
