Re: wget for mirroring/updating sites.

2004-01-13 Thread Daniel Daboul
On Tue, Jan 13, 2004 at 02:01:47PM +0100, [EMAIL PROTECTED] wrote:
> we are trying to upgrade to a newer version of wget (1.5.1 -> 1.9.1), but
> for some reason updating/mirroring of an FTP site is failing with the
> new version.

The following part from your example output ...

> Resolving proxy1.abcdef.com... 10.105.193.55
> Caching proxy1.abcdef.com => 10.105.193.55
> Connecting to proxy1.abcdef.com[10.105.193.55]:8080... connected.

... indicates that you're trying the recursive download over a
proxy. That's broken in all versions after 1.7.1. For reference, I
include below a recent exchange on that topic (from this list). I have
been using wget 1.7.1 happily since then, but I still think the problem
should be mentioned in the documentation of the current version. - Daniel

On Tue, Dec 09, 2003 at 11:30:54PM +0100, Hrvoje Niksic wrote:
> You're not making a mistake, recursive download over FTP proxies is
> currently broken.

On Wed, Dec 24, 2003 at 04:01:46PM +0200, Daniel Daboul wrote:
> That is probably the single feature I'd want most. Is there an older
> version of wget, where it works (didn't find it in the ChangeLog)?
> 
> Or is it expected to be fixed anytime soon? - Daniel

On Wed, Dec 24, 2003 at 09:53:38PM +0100, Jochen Roderburg wrote:
> New bugs are rarely documented in ChangeLogs, unless they are implemented
> deliberately ;-)
> 
> This one appeared in v1.8, older versions work as expected.
> 
> Best regards, J.Roderburg
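
Going by the version numbers reported in this exchange (working up to 1.7.1, broken from 1.8 on, still broken in 1.9.1), a script could check whether an installed wget falls in the affected range before attempting a recursive FTP download through a proxy. The Python sketch below is only an illustration: the function names are hypothetical, and the thread says nothing about releases newer than the ones mentioned here.

```python
import re

# First release reported in this thread as affected (per J. Roderburg).
FIRST_BROKEN = (1, 8)


def parse_version(text):
    """Turn a version string such as '1.9.1' into a tuple of ints."""
    match = re.match(r"(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def proxy_recursion_reported_broken(version_text):
    """True if this thread reports recursive FTP-over-proxy as broken.

    Reflects only the versions discussed in the thread; it makes no
    claim about later releases.
    """
    return parse_version(version_text) >= FIRST_BROKEN


if __name__ == "__main__":
    for version in ("1.5.1", "1.7.1", "1.8", "1.9.1"):
        print(version, proxy_recursion_reported_broken(version))
```

Comparing tuples of integers avoids the pitfall of comparing version strings character by character.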



wget fails using the proxy with https-protocol

2004-01-13 Thread Juergen Schliessmann
Hi,

Wget 1.9.1 fails when an HTTP proxy is used together with the secure
HTTPS protocol.

Packet sniffing shows that Wget does not initiate an SSL connection to
the proxy. Instead it connects directly to the target host (evident from
the DNS query) and bypasses the proxy entirely. Thus, if the proxy is
part of a secured LAN with a firewall, it is impossible to fetch data
over HTTPS via the HTTP proxy.
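
For comparison, HTTPS through an HTTP proxy is normally done by asking the proxy to open a tunnel with the HTTP CONNECT method, and only then starting the SSL handshake inside that tunnel. The Python sketch below builds such a request. It illustrates the first bytes one would expect to see going to the proxy; it is not a description of Wget's code, and the function name is hypothetical.

```python
def build_connect_request(host, port=443):
    """Return the bytes a client would send to an HTTP proxy to ask
    for a tunnel to host:port (HTTP CONNECT method)."""
    target = f"{host}:{port}"
    lines = [
        f"CONNECT {target} HTTP/1.1",
        f"Host: {target}",
        "",  # the two empty entries produce the blank line
        "",  # that terminates the request headers
    ]
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    # In a working setup the first TCP connection would go to the
    # proxy and carry a request like this one.
    print(build_connect_request("www.stanford.edu").decode("ascii"))
```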


The wget command line that was used is ...

wget https://www.stanford.edu/group/idg/leland/samples/secure/test.html


The logfile shows ...

--23:27:34--  https://www.stanford.edu/group/idg/leland/samples/secure/test.html
   => `test.html'
Resolving www.stanford.edu... 171.67.16.85
Connecting to www.stanford.edu[171.67.16.85]:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 159 [text/html]

0K   100%  636.37 KB/s

23:27:41 (636.37 KB/s) - `test.html' saved [159/159]


The ~/.wgetrc content is ...

waitretry = 10
http_proxy = http://81.137.90.137:8080/
use_proxy = on
dirstruct = off
recursive = off
backup_converted = off
logfile = ./download.log


A packet capture (Ethereal) shows ...

Source               Destination          Protocol  Info
---------------------------------------------------------------------------
SpeedPoint.localnet  T-DSL.localnet       DNS       Standard query A www.stanford.edu
T-DSL.localnet       SpeedPoint.localnet  DNS       Standard query response A 171.67.16.85
SpeedPoint.localnet  www.stanford.edu     TCP       32843 > https [SYN] Seq=1383116802 Ack=0 Win=5840 Len=0
www.stanford.edu     SpeedPoint.localnet  TCP       https > 32843 [SYN, ACK] Seq=1301652952 Ack=1383116803 Win=1460 Len=0
SpeedPoint.localnet  www.stanford.edu     TCP       32843 > https [ACK] Seq=1383116803 Ack=1301652953 Win=5840 Len=0
SpeedPoint.localnet  www.stanford.edu     SSLv2     Client Hello
www.stanford.edu     SpeedPoint.localnet  TCP       https > 32843 [ACK] Seq=1301652953 Ack=1383116927 Win=25092 Len=0
www.stanford.edu     SpeedPoint.localnet  TLS       Server Hello, Certificate, Server Key Exchange, Server Hello Done
SpeedPoint.localnet  www.stanford.edu     TCP       32843 > https [ACK] Seq=1383116927 Ack=1301654312 Win=8154 Len=0
SpeedPoint.localnet  www.stanford.edu     TLS       Client Key Exchange, Change Cipher Spec, Encrypted Handshake Message
www.stanford.edu     SpeedPoint.localnet  TCP       https > 32843 [ACK] Seq=1301654312 Ack=1383117117 Win=25092 Len=0
www.stanford.edu     SpeedPoint.localnet  TLS       Change Cipher Spec, Encrypted Handshake Message
SpeedPoint.localnet  www.stanford.edu     TCP       32843 > https [ACK] Seq=1383117117 Ack=1301654363 Win=8154 Len=0
SpeedPoint.localnet  www.stanford.edu     TLS       Application Data, Application Data
www.stanford.edu     SpeedPoint.localnet  TCP       https > 32843 [ACK] Seq=1301654363 Ack=1383117319 Win=25092 Len=0
www.stanford.edu     SpeedPoint.localnet  TLS       Application Data, Application Data
SpeedPoint.localnet  www.stanford.edu     TCP       32843 > https [FIN, ACK] Seq=1383117319 Ack=1301654821 Win=10872 Len=0
www.stanford.edu     SpeedPoint.localnet  TCP       https > 32843 [ACK] Seq=1301654821 Ack=1383117320 Win=25092 Len=0
www.stanford.edu     SpeedPoint.localnet  TCP       https > 32843 [FIN, ACK] Seq=1301654821 Ack=1383117320 Win=25092 Len=0
SpeedPoint.localnet  www.stanford.edu     TCP       32843 > https [ACK] Seq=1383117320 Ack=1301654822 Win=10872 Len=0
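
The reasoning behind the capture can be written as a small check: if the proxy were in use, it should show up as the destination of at least one packet sent by the client. The Python sketch below uses a hand-copied subset of the capture rows. The helper names are hypothetical, and it assumes the capture would display the proxy by the address given in ~/.wgetrc (81.137.90.137).

```python
# (source, destination, protocol) for a few rows copied from the capture.
CAPTURE = [
    ("SpeedPoint.localnet", "T-DSL.localnet", "DNS"),
    ("T-DSL.localnet", "SpeedPoint.localnet", "DNS"),
    ("SpeedPoint.localnet", "www.stanford.edu", "TCP"),
    ("www.stanford.edu", "SpeedPoint.localnet", "TCP"),
    ("SpeedPoint.localnet", "www.stanford.edu", "SSLv2"),
]

CLIENT = "SpeedPoint.localnet"
PROXY = "81.137.90.137"  # host part of http_proxy in ~/.wgetrc


def destinations(rows, client):
    """Set of hosts that `client` sent packets to."""
    return {dst for src, dst, _proto in rows if src == client}


def proxy_was_contacted(rows, client, proxy):
    """True if any packet from `client` was addressed to `proxy`."""
    return proxy in destinations(rows, client)


if __name__ == "__main__":
    print(sorted(destinations(CAPTURE, CLIENT)))
    print(proxy_was_contacted(CAPTURE, CLIENT, PROXY))
```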



Hint: The latest wget development release (1.9+cvs-dev)
also shows the problem.


Bye,

Juergen Schliessmann



RE: wget -- ftp with proxy

2004-01-13 Thread Post, Mark K
Yes, it should be.


Mark Post

-----Original Message-----
From: Cui, Byron [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 13, 2004 11:57 AM
To: [EMAIL PROTECTED]
Subject: wget -- ftp with proxy


Hi,

If I use FTP through a proxy, is the passive-ftp option still valid?

Thanks. 

Byron Cui

e-Commerce Infrastructure Support and Information Security 
IBG Production Support
Phone: 416-867-6822
Fax: 416-867-7157




This e-mail and any attachments may contain confidential and privileged
information. If you are not the intended recipient, please notify the sender
immediately by return e-mail, delete this e-mail and destroy any copies. Any
dissemination or use of this information by a person other than the intended
recipient is unauthorized and may be illegal. Unless otherwise stated,
opinions expressed in this e-mail are those of the author and are not
endorsed by the author's employer.


wget -- ftp with proxy

2004-01-13 Thread Cui, Byron
Hi,

If I use FTP through a proxy, is the passive-ftp option still valid?

Thanks. 

Byron Cui

e-Commerce Infrastructure Support and Information Security 
IBG Production Support
Phone: 416-867-6822
Fax: 416-867-7157








wget for mirroring/updating sites.

2004-01-13 Thread Rainer . Scherg
Hi,

we are trying to upgrade to a newer version of wget (1.5.1 -> 1.9.1), but
for some reason updating/mirroring of an FTP site is failing with the
new version.

We tried tweaking almost every wget option to get it running again.

What we are doing, and what happens:

 We are trying to do a daily (incremental) mirror of an FTP site through
 a chain of proxies.

 Whatever we do, the new wget just downloads the "index.html"
 for the directory and then stops. Wget does not try to check
 whether there are changes in a subtree (or in the current directory).

 This worked fine with version 1.5.1, but 1.5.1 has (IMO) a timestamp
 bug: downloading a file that was changed within a short time period
 seems to fail, because wget seems to ignore timezone differences.
 Maybe this happens when wget doesn't get timestamps for a file.
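
To illustrate the suspected timestamp problem (an assumption about the cause, not a confirmed diagnosis): if remote and local timestamps are compared as wall-clock values without their UTC offsets, a file changed shortly after the last mirror run can look older than the local copy. The Python sketch below uses made-up times and offsets.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical setup: the server reports times in UTC-5, the mirror
# host runs in UTC+1. The remote file was changed two hours after the
# local copy was written.
server_tz = timezone(timedelta(hours=-5))
local_tz = timezone(timedelta(hours=1))

local_copy = datetime(2004, 1, 13, 9, 0, tzinfo=local_tz)    # 08:00 UTC
remote_file = datetime(2004, 1, 13, 5, 0, tzinfo=server_tz)  # 10:00 UTC


def remote_is_newer_naive(remote, local):
    """Compare wall-clock values only, ignoring the UTC offsets."""
    return remote.replace(tzinfo=None) > local.replace(tzinfo=None)


def remote_is_newer_aware(remote, local):
    """Compare the actual instants; aware datetimes compare in UTC."""
    return remote > local


if __name__ == "__main__":
    print("naive comparison:", remote_is_newer_naive(remote_file, local_copy))
    print("aware comparison:", remote_is_newer_aware(remote_file, local_copy))
```

In this example the naive comparison would skip the file although it is in fact newer.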


So I would like to propose two options:
   - --resume-on-errors=count   (count = limit, to prevent endless loops)
   - some more incremental mirror handling options
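
The behaviour meant by the first proposal could look roughly like the Python sketch below. Note that --resume-on-errors is only the option proposed here, not an existing wget option, and that `fetch`, `fetch_with_resume` and `make_flaky_fetch` are hypothetical names used for illustration.

```python
def fetch_with_resume(fetch, max_resumes):
    """Call fetch(offset) until it completes or max_resumes is used up.

    `fetch` is a hypothetical callable: it receives the byte offset to
    resume from and returns (new_offset, done). It raises OSError on a
    transfer error.
    """
    offset = 0
    resumes = 0
    while True:
        try:
            offset, done = fetch(offset)
        except OSError:
            if resumes >= max_resumes:
                raise  # limit reached: give up instead of looping forever
            resumes += 1
            continue  # retry from the last offset that was reached
        if done:
            return offset


def make_flaky_fetch(total, chunk, fail_at):
    """Test double: delivers `chunk` bytes per call and fails once at
    each offset listed in `fail_at`."""
    pending = set(fail_at)

    def fetch(offset):
        if offset in pending:
            pending.discard(offset)
            raise OSError("simulated transfer error")
        new_offset = min(offset + chunk, total)
        return new_offset, new_offset == total

    return fetch


if __name__ == "__main__":
    print(fetch_with_resume(make_flaky_fetch(100, 40, {40}), max_resumes=3))
```

The counter is what prevents the endless loop: once the limit is reached, the error is passed on to the caller.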


Please see the wget output for the problem.


Is there any way to solve this problem using wget?


Tnx 

  Rainer



Example:
- snip 

 (antivirus-test)# /var/ftpd/sbin/wget-1.9.1 -v -S -c -m --cache=off -np -P /var/ftpd/pub/antivirus-test/ --follow-ftp -d ftp://ftp.nai.com/pub/antivirus/datfiles/4.x/
DEBUG output created by Wget 1.9.1 on solaris2.8.

--13:48:03--  ftp://ftp.nai.com/pub/antivirus/datfiles/4.x/
   => `/var/ftpd/pub/antivirus-test/ftp.nai.com/pub/antivirus/datfiles/4.x/index.html'
Resolving proxy1.abcdef.com... 10.105.193.55
Caching proxy1.abcdef.com => 10.105.193.55
Connecting to proxy1.abcdef.com[10.105.193.55]:8080... connected.
Created socket 4.
Releasing 56ef0 (new refcount 1).
---request begin---
HEAD ftp://ftp.nai.com/pub/antivirus/datfiles/4.x/ HTTP/1.0
User-Agent: Wget/1.9.1
Host: ftp.nai.com
Accept: */*
Range: bytes=11373-
Pragma: no-cache

---request end---
Proxy request sent, awaiting response... HTTP/1.0 200 OK
 2 Server: Squid/2.4.STABLE2
 3 Mime-Version: 1.0
 4 Date: Tue, 13 Jan 2004 12:34:43 GMT
 5 Content-Type: text/html
 6 X-Cache: MISS from X
 7 X-Cache: MISS from XXX
 8 Proxy-Connection: close


Continued download failed on this file, which conflicts with `-c'.
Refusing to truncate existing file
`/var/ftpd/pub/antivirus-test/ftp.nai.com/pub/antivirus/datfiles/4.x/index.html'.

Closing fd 4

FINISHED --13:48:05--
Downloaded: 0 bytes in 0 files



- snap -
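
The debug output above shows a request carrying "Range: bytes=11373-" that the proxy answers with "200 OK". A server or proxy that honors a range request replies with status 206 (Partial Content) and a Content-Range header; a plain 200 means the whole body would be sent from byte 0, which is presumably why `-c' refuses to continue. A minimal Python sketch of that decision follows. The function names are hypothetical and this is not wget's actual code.

```python
def range_was_honored(status, headers):
    """True if the response is a partial-content reply to a Range request."""
    names = {name.lower() for name in headers}
    return status == 206 and "content-range" in names


def can_continue_download(requested_offset, status, headers):
    """Decide whether appending to an existing partial file is safe."""
    if requested_offset == 0:
        return True  # nothing to resume, a full body is fine
    return range_was_honored(status, headers)


# Header names and values taken from the debug output above.
proxy_headers = {
    "Server": "Squid/2.4.STABLE2",
    "Mime-Version": "1.0",
    "Content-Type": "text/html",
    "Proxy-Connection": "close",
}

if __name__ == "__main__":
    # The request asked for bytes from offset 11373; the proxy said 200.
    print(can_continue_download(11373, 200, proxy_headers))
```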




---
Rainer Scherg
BRI/TDV6 - Internet,IntraNet,eMail
Industrial Hydraulics
Bosch Rexroth AG
"The Drive & Control Company"
97813, Lohr am Main
Phone: +49-(0)9352-18-1510
Fax:   +49-(0)9352-18-1500
www.boschrexroth.de





Re: wget for mirroring/updating sites.

2004-01-13 Thread Rainer . Scherg
Hi,

This mail is related to my prior post.
Here is a clipboard copy of the wget 1.9.1 output:


ftp.nai.com/pub/antivirus/datfiles/4.x/index.html: Invalid URL
http://ns1.boschrexroth.de:3128/squid-internal-static/icons/anthony-unknown.gif" ALT="[FILE]">
42984299.upd . . . . . . . . . . 10-21-03 10:26PM 180k
http://ns1.boschrexroth.de:3128/squid-internal-static/icons/anthony-text.gif" ALT="[VIEW]">
http://ns1.boschrexroth.de:3128/squid-internal-static/icons/anthony-box.gif" ALT="[DOWNLOAD]">: Unsupported scheme


The question is: why is an unsupported scheme reported?
wget fails to fetch/mirror the file.


cu
  Rainer



> -----Original Message-----
> From: Scherg, Rainer (BRI/TDV6) * 
> Sent: Tuesday, January 13, 2004 14:02
> To: '[EMAIL PROTECTED]'
> Subject: wget for mirroring/updating sites.
> 
> 
> Hi,
> 
> we are trying to upgrade to a newer version of wget (1.5.1 -> 1.9.1), but
> for some reason updating/mirroring of an FTP site is failing with the
> new version.
> 
[...]
>