RE: Wget - relative links within a script call aren't followed

2004-03-15 Thread Herold Heiko
No way, sorry.
wget does not support JavaScript, so there is no way to have it follow those
kinds of links.
Heiko

-- 
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED] [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax

> -----Original Message-----
> From: Raydeen A. Gallogly [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 12, 2004 4:20 PM
> To: [EMAIL PROTECTED]
> Subject: Wget - relative links within a script call aren't followed
> 
> 
> I'm new to Wget but have learned a lot in the last week.  We are
> successfully running Wget to mirror a website that sits on the other side of
> a firewall within our own agency.  We can retrieve all relative links from
> existing HTML files with the exception of those that are contained within a
> script.
> 
> For example, this is an excerpt from a script call to load an image within
> an HTML document that is not being followed:
> MM_preloadImages('pix/lats_but_lite.gif',)
> 
> The only fix to this problem that we have been able to implement so far is
> to have the webmaster of the site that we want to mirror create a small
> HTML file named 'wgetfixes.html' and link to it from the home page using style
> (display:none;) so that users won't see it.  Within the file, they list all the
> files that they are calling from within their scripts individually using
> the following syntax:  -- 
> This works fine,
> but I'm hopeful that there is a better way using a switch within Wget.
> 
> Thanks for any input, it is truly appreciated.  - Raydeen
> 
> ..
> 
> 
> Raydeen Gallogly
> Web Manager
> NYS Department of Health, Wadsworth Center
> http://www.wadsworth.org
> email: [EMAIL PROTECTED]


RE: Wget - relative links within a script call aren't followed

2004-03-15 Thread Fred Holmes
It surely would be nice if some day WGET could support JavaScript.  Is that something
to put on the "wish list" or is it substantially impossible to implement?  Do folks
use JavaScript to load images in order to thwart 'bots such as WGET?

I run into the same problem regularly, and simply create a series of lines in a batch
file that download each of the images by explicit filename.  Very doable, but it requires
manual setup, rather than having WGET automatically follow the links.  This will test
for/download files that are known to be there, but won't find files that are
newly added.
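
As a rough sketch of that batch-file approach, the explicit filenames can be
turned into a URL list that is handed to wget in a single run.  The base URL
below is a placeholder and the function name is made up for illustration; only
the image path comes from the example earlier in this thread.

```python
from urllib.parse import urljoin


def build_url_list(base_url, relative_paths):
    """Resolve each relative path against base_url.

    Duplicates are dropped and the original order is kept.
    """
    seen = set()
    urls = []
    for path in relative_paths:
        url = urljoin(base_url, path)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


if __name__ == "__main__":
    # Hypothetical base URL; the list of paths still has to be kept by hand.
    base = "http://www.example.org/"
    paths = ["pix/lats_but_lite.gif"]
    for url in build_url_list(base, paths):
        print(url)
```

Redirecting the output to a file such as script-images.txt and running
"wget -x -i script-images.txt" should fetch the listed files, with -x recreating
the directory layout.  The limitation is unchanged: files newly added to the
scripts are not found until the list is updated.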

Thanks,

Fred Holmes

At 05:07 AM 3/15/2004, Herold Heiko wrote:
>No way, sorry.
>wget does not support JavaScript, so there is no way to have it follow those
>kinds of links.
>Heiko



RE: Wget - relative links within a script call aren't followed

2004-03-15 Thread Herold Heiko
This has been discussed several times in the past. For a complete solution
a LOT of work would be needed (a complete JavaScript engine would be
necessary for starters), and there are also several semantic problems (for
example, if a picture is loaded only during mouseover, without preload, we still
would not get it, since there is no mouse).
Some very partial, incomplete solution might be possible, but
frankly that would be an ugly hack.
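
To make concrete what such a partial solution might look like outside of wget
(this is not a wget feature, and the names below are invented for illustration),
here is a sketch that scans already downloaded HTML for quoted strings ending in
an image extension.  It assumes the filenames appear literally in the page; any
name that a script computes at run time is still missed.

```python
import re
from urllib.parse import urljoin

# Any quoted string that ends in a common image extension.  This is a guess
# at what "looks like" an image path; it does not parse JavaScript at all.
QUOTED_IMAGE = re.compile(
    r"""['"]([^'"\s]+\.(?:gif|jpe?g|png))['"]""", re.IGNORECASE
)


def find_script_images(html, page_url):
    """Return absolute URLs for image-like quoted strings found in html."""
    urls = []
    for match in QUOTED_IMAGE.finditer(html):
        url = urljoin(page_url, match.group(1))
        if url not in urls:
            urls.append(url)
    return urls


if __name__ == "__main__":
    # The script call is the one from the original message; the page URL
    # is a placeholder.
    sample = "MM_preloadImages('pix/lats_but_lite.gif',)"
    for url in find_script_images(sample, "http://www.example.org/index.html"):
        print(url)
```

The pattern also matches ordinary src attributes, which is harmless because wget
retrieves those anyway.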
Heiko

-- 
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED] [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax

> -----Original Message-----
> From: Fred Holmes [mailto:[EMAIL PROTECTED]
> Sent: Monday, March 15, 2004 3:09 PM
> To: Herold Heiko; 'Raydeen A. Gallogly'; [EMAIL PROTECTED]
> Subject: RE: Wget - relative links within a script call 
> aren't followed
> 
> 
> It surely would be nice if some day WGET could support 
> JavaScript.  Is that something to put on the "wish list" or 
> is it substantially impossible to implement?  Do folks use 
> JavaScript to load images in order to thwart 'bots such as WGET?
> 
> I run into the same problem regularly, and simply create a 
> series of lines in a batch file that download each of the 
> images by explicit filename.  Very doable, but it requires 
> manual setup, rather than having WGET automatically follow 
> the links.  This will test for/download files that are known 
> to be there, but won't find files that are newly added.
> 
> Thanks,
> 
> Fred Holmes