Addition to MACHINES File
As requested, I am including the output from ./config.guess for my Linux for S/390 system:

# ./config.guess
s390-ibm-linux

Version 1.5.3 works just fine on this system, although I am having problems with 1.6 and 1.7, which I am detailing in a separate email.

Mark Post
Segfault on Linux/390 for wget 1.6 and 1.7
I am having problems with both wget 1.6 and wget 1.7. I have a working wget 1.5.3 that I use quite a lot. When I compile wget 1.6 or 1.7, using either the -O2 (default) or -O1 parameters on gcc 2.95.2, I get segmentation faults as follows:

# wget -m -nd ftp://ftp.slackware.com/pub/slackware/alpha/slackware-current/slakware/n1/wget*
--13:06:27-- ftp://ftp.slackware.com/pub/slackware/alpha/slackware-current/slakware/n1/wget*
           => `.listing'
Connecting to ftp.slackware.com:21... connected!
Logging in as anonymous ... Logged in!
==> TYPE I ... done.  ==> CWD pub/slackware/alpha/slackware-current/slakware/n1 ... done.
==> PORT ... done.    ==> LIST ... done.

    0K -> ...

13:06:39 (417.44 B/s) - `.listing' saved [4025]

Segmentation fault

When I compile wget with -O0 to turn off optimization, wget works, but I get some garbage in the output as follows:

# wget -m -nd ftp://ftp.slackware.com/pub/slackware/alpha/slackware-current/slakware/n1/wget*
--13:01:08-- ftp://ftp.slackware.com/pub/slackware/alpha/slackware-current/slakware/n1/wget*
           => `.listing'
Connecting to ftp.slackware.com:21... connected!
Logging in as anonymous ... Logged in!
==> TYPE I ... done.  ==> CWD pub/slackware/alpha/slackware-current/slakware/n1 ... done.
==> PORT ... done.    ==> LIST ... done.

    0K -> ...

13:01:11 (3.10 KB/s) - `.listing' saved [4025]

--@woeÿâ8Àt¸EUR@b¸EUR@fOEÿâ8--
           => `'
==> CWD not required.
==> PORT ... done.    ==> RETR wget-1.6-alpha-1.tgz ... done.
Length: 274,854

    0K -> .......... .......... .......... .......... .......... [ 18%]
   50K -> .......... .......... .......... .......... .......... [ 37%]
  100K -> .......... .......... .......... .......... .......... [ 55%]
  150K -> .......... .......... .......... .......... .......... [ 74%]
  200K -> .......... .......... .......... .......... .......... [ 93%]
  250K -> .......... ........                                    [100%]

13:01:22 (25.12 KB/s) - `wget-1.6-alpha-1.tgz' saved [274854]

FINISHED --13:01:22--
Downloaded: 278,879 bytes in 2 files

What other documentation would you need from me on this problem? Please be specific on how to get it, also, since I am very unfamiliar with gdb, etc.

Mark Post
RE: Segfault on Linux/390 for wget 1.6 and 1.7
Jan,

Did you ever make any progress on this?

Mark Post

-----Original Message-----
From: Jan Prikryl [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 19, 2001 1:53 PM
To: Post, Mark K
Cc: Wget mailing list
Subject: Re: Segfault on Linux/390 for wget 1.6 and 1.7

Quoting Post, Mark K ([EMAIL PROTECTED]):

> When I compile wget with -O0 to turn off optimization, wget works,
> but I get some garbage in the output as follows:

Could you please try (1) running wget with the -d parameter to switch on the debugging output, and (2) compiling wget using -O2 -g and having a look at what "gdb wget core" reports? It should be able to provide us with the contents of the call stack at the moment of the crash, which in turn would reveal the place where wget crashes.

Thanks,

-- jan

Dr. Jan Prikryl <[EMAIL PROTECTED]> (icq 83242638)
vr|vis center for virtual reality and visualisation
http://www.vrvis.at
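For anyone else hitting this, Jan's two suggestions spelled out as concrete commands — a sketch only, assuming a GNU-style wget source tree and a shell that allows core dumps; the exact configure/make invocation may differ on Linux/390:

```shell
# 1. Run wget with debugging output enabled.
wget -d -m -nd 'ftp://ftp.slackware.com/pub/slackware/alpha/slackware-current/slakware/n1/wget*'

# 2. Rebuild with symbols at the optimization level that crashes,
#    let the crash write a core file, then ask gdb for a backtrace.
make clean
make CFLAGS='-O2 -g'
ulimit -c unlimited
./src/wget -m -nd 'ftp://ftp.slackware.com/pub/slackware/alpha/slackware-current/slakware/n1/wget*'
gdb ./src/wget core <<'EOF'
bt
quit
EOF
```

The `bt` (backtrace) output is what shows where the crash happened; posting it to the list is usually enough to start narrowing things down.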
RE: wget: ftp through http proxy not working with 1.8.2. It does work with 1.5.3
Hans,

I'm investigating this as a proxy server problem. When I ran some tests, it appeared as though the HEAD command from wget was getting translated into a series of commands to query the size, MDTM of the file, etc., but then I was seeing a STOR command come from the proxy server, which was getting failed by the FTP server. If you'd like to coordinate some tests off-list, we can see if something similar is happening to you. Do you know what proxy server you are using?

Mark Post

-----Original Message-----
From: Hans Deragon (QA/LMC) [mailto:[EMAIL PROTECTED]
Sent: Monday, July 14, 2003 10:21 AM
To: '[EMAIL PROTECTED]'
Subject: RE: wget: ftp through http proxy not working with 1.8.2. It does work with 1.5.3

Hi again. Some people have reported experiencing the same problem, but nobody from the development team has forwarded a comment on this. Can anybody tell us if this is a bug or some config issue?

Regards,
Hans Deragon

-----Original Message-----
From: Hans Deragon (LMC)
Sent: Wednesday, July 02, 2003 2:11 PM
To: '[EMAIL PROTECTED]'
Subject: wget: ftp through http proxy not working with 1.8.2. It does work with 1.5.3

Greetings. I have read many emails in different archives on the net regarding my issue, but never found a solution to my problem. Here is the description:

I am trying to mirror an ftp site through an http proxy. I have the following settings on both my RH machine running wget 1.8.2 and my Solaris machine running wget 1.5.3 (the actual /etc/wgetrc file is the same on both machines, and I do not have ~/.wgetrc on either machine):

http_proxy = http://proxy.hostname.com:80/
ftp_proxy = http://proxy.hostname.com:80/
use_proxy = on

Ok, proxy.hostname.com is not the real URL I use, but believe me, the one I am using is the right one. The port numbers are valid and only port 80 is open, so both http and ftp requests must go through port 80.
Now, when I run, say:

wget --mirror -np --cut-dirs=2 ftp://ftp.ox.ac.uk/pub/wordlists/

on my RH machine running wget 1.8.2, only the index.html file is downloaded. On my Solaris machine running wget 1.5.3, all the files under pub/wordlists get downloaded. Has anybody got a clue what the problem is? Is it that wget 1.8.2 is more compliant with a standard, my proxy is bogus, and I am just lucky it works with version 1.5.3? Is there a configuration parameter missing from my /etc/wgetrc? Or is it a known problem with version 1.8.2? I am a newbie with wget, so the mistake is probably on my side, but I cannot figure out what it is.

Regards,
Hans Deragon
RE: wget: ftp through http proxy not working with 1.8.2. It does work with 1.5.3
Hans,

Based on what I'm seeing on the FTP server side, and the debugging output you sent, this definitely looks like a wget problem. From the FTP server, everything looks normal. No strange command sequences, nothing odd at all. So, I guess you need to keep bugging the wget maintainer (if he's still interested in working on wget).

Mark

-----Original Message-----
From: Hans Deragon (QA/LMC) [mailto:[EMAIL PROTECTED]
Sent: Monday, July 14, 2003 1:05 PM
To: 'Post, Mark K'
Subject: RE: wget: ftp through http proxy not working with 1.8.2. It does work with 1.5.3

wget --debug -m ftp://l015062.zseriespenguins.ihost.com/

2nd time output:
==========
DEBUG output created by Wget 1.8.2 on linux-gnu.

--12:58:00-- ftp://l015062.zseriespenguins.ihost.com/
           => `l015062.zseriespenguins.ihost.com/index.html'
Resolving www-proxy.lmc.ericsson.se... done.
Caching www-proxy.lmc.ericsson.se => 142.133.17.203
Connecting to www-proxy.lmc.ericsson.se[142.133.17.203]:80... connected.
Created socket 3.
Releasing 0x8082060 (new refcount 1).
---request begin---
HEAD ftp://l015062.zseriespenguins.ihost.com/ HTTP/1.0
User-Agent: Wget/1.8.2
Host: l015062.zseriespenguins.ihost.com
Accept: */*

---request end---
Proxy request sent, awaiting response... HTTP/1.0 200 OK
Server: Squid/2.4.STABLE2
Mime-Version: 1.0
Date: Mon, 14 Jul 2003 16:59:11 GMT
Content-Type: text/html
Age: 25
X-Cache: HIT from www-proxy.lmc.ericsson.se
Proxy-Connection: close
Length: unspecified [text/html]
Closing fd 3
Last-modified header missing -- time-stamps turned off.
--12:58:00-- ftp://l015062.zseriespenguins.ihost.com/
           => `l015062.zseriespenguins.ihost.com/index.html'
Found www-proxy.lmc.ericsson.se in host_name_addresses_map (0x8082060)
Connecting to www-proxy.lmc.ericsson.se[142.133.17.203]:80... connected.
Created socket 3.
Releasing 0x8082060 (new refcount 1).
---request begin---
GET ftp://l015062.zseriespenguins.ihost.com/ HTTP/1.0
User-Agent: Wget/1.8.2
Host: l015062.zseriespenguins.ihost.com
Accept: */*

---request end---
Proxy request sent, awaiting response... HTTP/1.0 200 OK
Server: Squid/2.4.STABLE2
Mime-Version: 1.0
Date: Mon, 14 Jul 2003 16:59:47 GMT
Content-Type: text/html
X-Cache: MISS from www-proxy.lmc.ericsson.se
Proxy-Connection: close
Length: unspecified [text/html]

    0K .                                                      163.92 KB/s

Closing fd 3
12:58:11 (163.92 KB/s) - `l015062.zseriespenguins.ihost.com/index.html' saved [1175]

FINISHED --12:58:11--
Downloaded: 1,175 bytes in 1 files

Diff between 1st and 2nd output:
==========
[EMAIL PROTECTED] 2]# diff output ../1/output
3c3
< --12:58:00-- ftp://l015062.zseriespenguins.ihost.com/
---
> --12:56:41-- ftp://l015062.zseriespenguins.ihost.com/
11,36d10
< HEAD ftp://l015062.zseriespenguins.ihost.com/ HTTP/1.0
< User-Agent: Wget/1.8.2
< Host: l015062.zseriespenguins.ihost.com
< Accept: */*
<
< ---request end---
< Proxy request sent, awaiting response... HTTP/1.0 200 OK
< Server: Squid/2.4.STABLE2
< Mime-Version: 1.0
< Date: Mon, 14 Jul 2003 16:59:11 GMT
< Content-Type: text/html
< Age: 25
< X-Cache: HIT from www-proxy.lmc.ericsson.se
< Proxy-Connection: close
< Length: unspecified [text/html]
< Closing fd 3
< Last-modified header missing -- time-stamps turned off.
< --12:58:00-- ftp://l015062.zseriespenguins.ihost.com/
<            => `l015062.zseriespenguins.ihost.com/index.html'
< Found www-proxy.lmc.ericsson.se in host_name_addresses_map (0x8082060)
< Connecting to www-proxy.lmc.ericsson.se[142.133.17.203]:80... connected.
< Created socket 3.
< Releasing 0x8082060 (new refcount 1).
< ---request begin---
46c20
< Date: Mon, 14 Jul 2003 16:59:47 GMT
---
> Date: Mon, 14 Jul 2003 16:58:28 GMT
54c28
<     0K .                                                      163.92 KB/s
---
>     0K .                                                      382.49 KB/s
57c31,32
< 12:58:11 (163.92 KB/s) - `l015062.zseriespenguins.ihost.com/index.html' saved [1175]
---
> Last-modified header missing -- time-stamps turned off.
> 12:56:52 (382.49 KB/s) - `l015062.zseriespenguins.ihost.com/index.html' saved [1175]
60c35
< FINISHED --12:58:11--
---
> FINISHED --12:56:52--
RE: wget and procmail
Does the PATH of procmail contain the directory where wget lives?

Mark Post

-----Original Message-----
From: Michel Lombart [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 29, 2003 6:51 PM
To: [EMAIL PROTECTED]
Subject: wget and procmail

Hello,

I have an issue with wget and procmail. I installed the forum software mailgust (http://mailgust.phpoutsourcing.com/) on a Cobalt/Sun Raq4. In order to use incoming e-mail, I need to install a .procmailrc file that calls wget. When I type the complete command on the console, wget works fine. When wget is called by procmail, it does nothing. I've enabled a verbose logfile for procmail and I see the call of wget in the log, without any error. Any idea?

Thanks for your help,
Michel Lombart
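If PATH is indeed the cause, the usual fix is to set PATH in the rcfile or call wget by absolute path. A hypothetical sketch of such a .procmailrc — the recipe condition, the URL, and the wget location are made-up placeholders, not mailgust's actual interface:

```procmailrc
# procmail inherits a minimal environment; make PATH explicit.
PATH=/usr/local/bin:/usr/bin:/bin
LOGFILE=$HOME/procmail.log
VERBOSE=on

# Hand matching messages to the forum gateway via wget's absolute path.
:0
* ^To:.*forum@
| /usr/local/bin/wget -q -O - http://localhost/mailgust/gateway.php
```

With VERBOSE=on, the logfile will also show the environment procmail actually gave the recipe, which helps confirm or rule out the PATH theory.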
RE: -N option
Other than the --ignore-length option I mentioned previously, no. Sorry.

Mark Post

-----Original Message-----
From: Preston [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 29, 2003 7:01 PM
To: [EMAIL PROTECTED]
Subject: Re: -N option

Aaron S. Hawley wrote:
> On Tue, 29 Jul 2003, Post, Mark K wrote:
>> So, perhaps you need to modify your work practices rather than diddle
>> with the software. Copy the locally updated files to another location
>> so they're not clobbered when the remote version changes.
>
> indeed. consider creating local copies by instead just tracking versions
> of your image files with RCS if it's available for your system (and if
> you aren't already using it): http://www.gnu.org/software/rcs/

To answer the questions asked so far: We are using wget version 1.8.2. I have checked the dates on the local file and the remote file, and the local file date is newer. The reason I thought it was still clobbering despite the newer date on the local file was because of the size difference. I read that in the online manual here: http://www.gnu.org/manual/wget/html_chapter/wget_5.html#SEC22 At the bottom it says, "If the local file does not exist, or the sizes of the files do not match, Wget will download the remote file no matter what the time-stamps say."

I do want newer files on the remote server to replace older files on the local server. Essentially, I want the newest file to remain on the local server. The problem I am having, however, is that if we change/update files on the local server and they are of a different size, the remote copy is downloaded and clobbers the local copy no matter what the dates are. I hope this is clear; sorry if I have not explained the problem well. Let me know if you have any more ideas, and if you need me to try again to explain. Thanks for your help.

Preston
[EMAIL PROTECTED]
RE: Wget 1.8.2 timestamping bug
Angelo,

It works for me:

# wget -N http://www.nic.it/index.html
--13:04:39-- http://www.nic.it/index.html
           => `index.html'
Resolving www.nic.it... done.
Connecting to www.nic.it[193.205.245.10]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,474 [text/html]

100%[====================================>] 2,474        142.12K/s    ETA 00:00

13:04:44 (142.12 KB/s) - `index.html' saved [2474/2474]

[EMAIL PROTECTED]:/tmp# wget -N http://www.nic.it/index.html
--13:04:49-- http://www.nic.it/index.html
           => `index.html'
Resolving www.nic.it... done.
Connecting to www.nic.it[193.205.245.10]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,474 [text/html]
Server file no newer than local file `index.html' -- not retrieving.

[EMAIL PROTECTED]:/tmp# wget -V
GNU Wget 1.8.2

Are you perhaps behind a firewall? At my work location, I frequently run into cases where the firewall does not correctly pass date and timestamp information back to wget.

Mark Post

-----Original Message-----
From: Angelo Archie Amoruso [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 05, 2003 6:36 AM
To: [EMAIL PROTECTED]
Subject: Wget 1.8.2 timestamping bug

Hi All,

I'm using Wget 1.8.2 on a Redhat 9.0 box equipped with an Athlon 550 MHz CPU and 128 MB RAM. I've encountered a strange issue, which really seems to be a bug, using the timestamping option. I'm trying to retrieve the http://www.nic.it/index.html page. The HEAD HTTP method reports that the page is 2474 bytes long and was Last-Modified on Wed, 30 Oct 2002.

Using wget I retrieve it (using -N) to /tmp, and I get (ls -l --time-style=long):

-rw-r--r--    1 root     root         2474 2002-10-30 15:53 index.html

Then, running wget again with -N, I get:

The sizes do not match (local 91941) -- retrieving

And on /tmp I get again:

-rw-r--r--    1 root     root         2474 2002-10-30 15:53 index.html

What's happening? Does Wget check the file creation time, which is obviously:

-rw-r--r--    1 root     root         2474 2003-08-05 12:28 index.html

Thanks for your time and cooperation. Please reply by email.
Below you'll find the actual output.

===== HEAD =====
Trying 193.205.245.10...
Connected to www.nic.it.
Escape character is '^]'.
GET /index.html HTTP/1.0

HTTP/1.1 200 OK
Date: Tue, 05 Aug 2003 10:11:26 GMT
Server: Apache/2.0.45 (Unix) mod_ssl/2.0.45 OpenSSL/0.9.7a
Last-Modified: Wed, 30 Oct 2002 14:53:58 GMT
ETag: "2bc04-9aa-225d2d80"
Accept-Ranges: bytes
Content-Length: 2474
Connection: close
Content-Type: text/html; charset=ISO-8859-1

I run wget with the following parameters: wget -N -O /tmp/index.html

===== GET OUTPUT =====
[EMAIL PROTECTED] celldataweb]# wget -N -O /tmp/index.html http://www.nic.it/index.html
--12:18:31-- http://www.nic.it/index.html
           => `/tmp/index.html'
Resolving www.nic.it... done.
Connecting to www.nic.it[193.205.245.10]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,474 [text/html]
The sizes do not match (local 91941) -- retrieving.

--12:18:31-- http://www.nic.it/index.html
           => `/tmp/index.html'
Connecting to www.nic.it[193.205.245.10]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,474 [text/html]

100%[====================================>] 2,474         38.35K/s    ETA 00:00

12:18:32 (38.35 KB/s) - `/tmp/index.html' saved [2474/2474]

On /tmp:
-rw-r--r--    1 root     root         2474 Oct 30  2002 index.html

When I try again:

===== SECOND GET OUTPUT =====
[EMAIL PROTECTED] celldataweb]# wget -N -O /tmp/index.html http://www.nic.it/index.html
--12:18:31-- http://www.nic.it/index.html
           => `/tmp/index.html'
Resolving www.nic.it... done.
Connecting to www.nic.it[193.205.245.10]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,474 [text/html]
The sizes do not match (local 91941) -- retrieving.

--12:18:31-- http://www.nic.it/index.html
           => `/tmp/index.html'
Connecting to www.nic.it[193.205.245.10]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,474 [text/html]

100%[====================================>] 2,474         38.35K/s    ETA 00:00

12:18:32 (38.35 KB/s) - `/tmp/index.html' saved [2474/2474]

But on /tmp:
-rw-r--r--    1 root     root         2474 Oct 30  2002 index.html

What is happening?

-- To The Kernel And Beyond!
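One experiment worth trying here — this is my assumption, not a confirmed diagnosis: the "local 91941" suggests the -N size check with -O may be looking at some other file than the one -O writes to. Dropping -O and letting -N manage the file under its natural name, from the target directory, sidesteps that question entirely:

```shell
# Run the timestamped fetch from the target directory instead of
# redirecting the output with -O; -N then checks and writes the
# same path, ./index.html.
cd /tmp
wget -N http://www.nic.it/index.html
```

If this behaves (as in the working transcript above), the interaction of -N with -O is the thing to report.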
RE: wget is mirroring whole internet instead of just my web page!
man wget shows:

-D domain-list
--domains=domain-list
    Set domains to be followed. domain-list is a comma-separated list
    of domains. Note that it does not turn on -H.

Mark Post

-----Original Message-----
From: Andrzej Kasperowicz [mailto:[EMAIL PROTECTED]
Sent: Monday, August 18, 2003 8:38 AM
To: [EMAIL PROTECTED]
Subject: wget is mirroring whole internet instead of just my web page!

When I try to mirror web pages using the command:

wget -m -nv -k -K -nH -t 100 -o logchemfanpl -P public_html/mirror http://znik.wbc.lublin.pl/ChemFan/

wget mirrors not just the domain of the web page but the whole internet... There are robots.txt files, but they should not influence wget to download all available domains, I suppose? So why is this happening, and how do I avoid it?

Regards,
Andrzej.
RE: wget is mirroring whole internet instead of just my web page!
It's always been my experience when specifying -m that wget does follow across domains by default. I've always had to tell it not to do that.

Mark Post

-----Original Message-----
From: Andrzej Kasperowicz [mailto:[EMAIL PROTECTED]
Sent: Monday, August 18, 2003 4:02 PM
To: Post, Mark K; [EMAIL PROTECTED]
Subject: RE: wget is mirroring whole internet instead of just my web page!

On 18 Aug 2003 at 13:49, Post, Mark K wrote:

> man wget shows:
> -D domain-list
> --domains=domain-list
>     Set domains to be followed. domain-list is a comma-separated list
>     of domains. Note that it does not turn on -H.

Right, but by default wget should not follow all domains, so why was it happening in this case? I also tried to mirror another web site from the same server, also containing links to other domains:

wget -m -nv -k -K -nH -t 100 -o logmineraly -P public_html/mirror http://znik.wbc.lublin.pl/Mineraly/

and in this case it was not downloading from other domains. So that's a real mystery. Anyway, if I add -D wbc.lublin.pl, will it run correctly?

wget -m -nv -k -K -nH -t 100 -D wbc.lublin.pl -o logchemfanpl -P public_html/mirror http://znik.wbc.lublin.pl/ChemFan/

ak
RE: wget and 2 users / passwords to get through?
If this is a non-transparent proxy, you do indeed need to use the proxy parameters:

--proxy-user=user
--proxy-passwd=password

as well as set the proxy server environment variables:

ftp_proxy=http://proxy.server.name[:port]   (Note the http:// value. That is correct.)
http_proxy=http://proxy.server.name[:port]

Mark Post

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 19, 2003 8:03 AM
To: [EMAIL PROTECTED]
Subject: wget and 2 users / passwords to get through?

Hi,

I'm trying to use wget on an external ftp server, but I have to pass through a gateway server in the company before I'm on the internet. So I have to specify 2 sets of user/pass: 1 set for the gateway and 1 for the ftp server. This works OK in e.g. Windows Commander, but how do I specify this when calling wget? I have tried several combinations of:

--ftp-user=
--ftp-passwd=

I don't think that the proxy thing should be involved here.

Best Regards / Venlig Hilsen
Lars Rasmussen

-- Rohde & Schwarz Technology Center A/S  Tel.: +45 96 73 88 88  http://www.rohdeschwarz.dk
-- Lars Rasmussen, SW Developer  Tel.: +45 96 73 88 34  mailto:[EMAIL PROTECTED]
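Putting the two credential sets together, a sketch of what the invocation might look like — the host names, users, and passwords below are placeholders, not anything from the original question; --proxy-user/--proxy-passwd carry the gateway login, while the FTP server's login can ride inside the URL:

```shell
# Gateway (proxy) settings; the http:// scheme is correct even for ftp_proxy.
export ftp_proxy=http://proxy.example.com:80/
export http_proxy=http://proxy.example.com:80/

# Gateway credentials on the command line; FTP server credentials
# embedded in the URL itself.
wget --proxy-user=gwuser --proxy-passwd=gwsecret \
     'ftp://ftpuser:ftpsecret@ftp.example.com/pub/file.txt'
```

Single-quoting the URL keeps the shell from touching any special characters in the embedded password.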
RE: rfc2732 patch for wget
Absolutely. I would much rather get an intelligent error message stating that IPv6 addresses are not supported, versus a misleading one about the host not being found. That would save end-users a whole lot of wasted time.

Mark Post

-----Original Message-----
From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]
Sent: Friday, September 05, 2003 4:23 PM
To: Mauro Tortonesi
Cc: [EMAIL PROTECTED]
Subject: Re: rfc2732 patch for wget

-snip-
I'm starting to think that Wget should reject all [...] addresses when IPv6 is not compiled in because they, being valid IPv6 addresses, have no chance of ever working. What do you think?
RE: wget -r -p -k -l 5 www.protcast.com doesn't pull some images though they are part of the HREF
No, it won't. The javascript stuff makes sure of that.

Mark Post

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 09, 2003 4:32 PM
To: [EMAIL PROTECTED]
Subject: wget -r -p -k -l 5 www.protcast.com doesn't pull some images though they are part of the HREF

Hi,

I am having some problems with downloading www.protcast.com. I used:

wget -r -p -k -l 5 www.protcast.com

In www.protcast.com/Grafx, the file menu-contact_(off).jpg gets downloaded. However, menu-contact_(on).jpg does not get downloaded, though it lies in the same directory as the menu-contact_(off).jpg file. index.html contains the following HREF:

<A HREF="contact.htm" ONMOUSEOVER="msover1('m-contact','Grafx/menu-contact_(on).jpg');" ONMOUSEOUT="msout1('m-contact','Grafx/menu-contact_(off).jpg');"><IMG SRC="Grafx/menu-contact_(off).jpg" NAME="m-contact" WIDTH="197" HEIGHT="29" BORDER="0"></A>

so wget should be able to see this image, right? Please help/advise.

bye
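The rollover image is referenced only inside the string arguments of the JavaScript handler, not in an HREF/SRC attribute, so wget's HTML parsing never sees it. As a workaround one could scrape those paths out of the saved page and feed them back to wget — a sketch, assuming the msover1(...) pattern of the page quoted above and a wget that accepts `-i -` to read URLs from stdin:

```shell
# Pull the image paths that appear only inside msover1(...) JavaScript
# calls out of the saved page, turn them into absolute URLs, and fetch them.
sed -n "s@.*msover1('[^']*','\([^']*\)').*@http://www.protcast.com/\1@p" \
    www.protcast.com/index.html | wget -x -nH -i -
```

The same sed line with msout1 would catch the mouse-out variants; any site using a different handler name needs the pattern adjusted.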
RE: Small change to print SSL version
Perhaps, but it is kind of nice to get that information from the program itself at the same time you get the version information. For example:

# ssh -V
OpenSSH_3.7p1, SSH protocols 1.5/2.0, OpenSSL 0.9.7b 10 Apr 2003

All the information, from one place.

Mark Post

-----Original Message-----
From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 17, 2003 7:15 AM
To: Christopher G. Lewis
Cc: [EMAIL PROTECTED]
Subject: Re: Small change to print SSL version

Christopher G. Lewis <[EMAIL PROTECTED]> writes:

> Here's a small change to print out the OpenSSL version with the -V /
> --help parameters.  [...]

I think that "GNU Wget <something>" should always stand for Wget's version, regardless of the libraries it has been compiled with. But if you want to see the version of libraries, why not make it clearer, e.g.:

GNU Wget x.x.x (compiled with OpenSSL x.x.x)

BTW can't you find out the OpenSSL version by using `ldd'?
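For reference, the `ldd' route Hrvoje mentions would look something like this on Linux — it shows which OpenSSL shared libraries wget is linked against, though not necessarily the exact version string OpenSSL would report about itself:

```shell
# List wget's shared-library dependencies and pick out the OpenSSL ones.
ldd "$(command -v wget)" | grep -E 'libssl|libcrypto'
```

A statically linked wget, or a non-glibc platform without ldd, is exactly the case where printing the library version from the program itself earns its keep.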
RE: Compile and link problems with wget 1.9 beta5
Do you see the missing symbol when you do an "nm -D" command against either libssl.so or libcrypto.so? (It shows up on my Linux system in libcrypto.so.)

Mark Post

-----Original Message-----
From: Robert Poole [mailto:[EMAIL PROTECTED]
Sent: Sunday, October 12, 2003 2:23 PM
To: [EMAIL PROTECTED]
Subject: Compile and link problems with wget 1.9 beta5

After ploughing through the archives of this mailing list, looking for additional clues why wget 1.8.2 wasn't linking correctly, I found that wget 1.9 beta 5 was released recently. I downloaded the source code for wget 1.9 beta 5 and am getting the same link problems I was getting with 1.8.2:

/bin/sh ../libtool --mode=link gcc -O2 -Wall -Wno-implicit -o wget cmpt.o connect.o convert.o cookies.o ftp.o ftp-basic.o ftp-ls.o ftp-opie.o getopt.o hash.o headers.o host.o html-parse.o html-url.o http.o init.o log.o main.o gen-md5.o netrc.o progress.o rbuf.o recur.o res.o retr.o safe-ctype.o snprintf.o gen_sslfunc.o url.o utils.o version.o -lssl -lcrypto
mkdir .libs
gcc -O2 -Wall -Wno-implicit -o wget cmpt.o connect.o convert.o cookies.o ftp.o ftp-basic.o ftp-ls.o ftp-opie.o getopt.o hash.o headers.o host.o html-parse.o html-url.o http.o init.o log.o main.o gen-md5.o netrc.o progress.o rbuf.o recur.o res.o retr.o safe-ctype.o snprintf.o gen_sslfunc.o url.o utils.o version.o -lssl -lcrypto
ld: Undefined symbols:
_OPENSSL_add_all_algorithms_noconf
make[1]: *** [wget] Error 1
make: *** [src] Error 2

I've tried to determine if my OpenSSL installation was built wrong, but as far as I can determine, it's OK. That doesn't mean that there's nothing wrong with OpenSSL on this platform, but so far, this link error has been the only problem I've encountered. The platform is a dual-processor G5 running Mac OS X 10.2.8 with the latest developer tools installed (gcc 3.3 with G5 optimizer settings available, although I haven't used any of the command line switches to turn those optimizations on). Help?

Best Regards,
Rob Poole
[EMAIL PROTECTED]
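Spelled out, Mark's check might look like this — the library path is illustrative, and note that on Mac OS X the library is typically libcrypto.dylib and Apple's nm has no -D flag, so plain `nm` on the dylib is the closer equivalent there:

```shell
# Search the dynamic symbol table of libcrypto for the symbol the
# linker reports as undefined.
nm -D /usr/lib/libcrypto.so | grep OPENSSL_add_all_algorithms
```

If the symbol is absent, the OpenSSL the linker is finding is likely older than the one wget's configure step detected; I believe OPENSSL_add_all_algorithms_noconf first appeared in OpenSSL 0.9.7, while the transcripts above show 0.9.6-era libraries on that OS X release.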
RE: feature request: --second-guess-the-dns
You can do this now:

wget http://216.46.192.85/

Using DNS is just a convenience, after all, not a requirement.

Mark Post

-----Original Message-----
From: Dan Jacobson [mailto:[EMAIL PROTECTED]
Sent: Saturday, November 15, 2003 4:00 PM
To: [EMAIL PROTECTED]
Subject: feature request: --second-guess-the-dns

I see there is:

--bind-address=ADDRESS
    When making client TCP/IP connections, bind() to ADDRESS on the
    local machine. ADDRESS may be specified as a hostname or IP
    address. This option can be useful if your machine is bound to
    multiple IPs.

But I want a --second-guess-the-dns=ADDRESS so I can:

$ wget http://jidanni.org/
Resolving jidanni.org... done.
Connecting to jidanni.org[216.46.203.182]:80... connected.
HTTP request sent, awaiting response... 503 Service Unavailable

$ wget --second-guess-the-dns=216.46.192.85 http://jidanni.org/
Connecting to jidanni.org[216.46.192.85]:80... connected...

Even allow different port numbers there, even though we can add them after the url already:

$ wget --second-guess-the-dns=216.46.192.85:66 http://jidanni.org:888/

or whatever. Also, pick a better name than --second-guess-the-dns -- which is just a first guess for a name. Perhaps the user should do all this in the name server or something, but let's say he isn't root, and doesn't want to use netcat etc. either.
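One caveat with plain `wget http://216.46.192.85/`: a name-based virtual host will not know which site is wanted. Supplying the Host header by hand gets closer to what the requested option would do — a sketch; check that your wget version supports --header:

```shell
# Connect to the chosen IP address but present the original host name,
# so a name-based virtual host serves the right site.
wget --header='Host: jidanni.org' http://216.46.192.85/
```

Saved file names will still be derived from the IP-based URL, so some renaming afterwards may be needed.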
RE: how to get mirror just a portion of a website ?
Use the -np or --no-parent option.

Mark Post

-----Original Message-----
From: Josh Brooks [mailto:[EMAIL PROTECTED]
Sent: Sunday, November 16, 2003 11:48 PM
To: [EMAIL PROTECTED]
Subject: how to get mirror just a portion of a website ?

Generally, I mirror an entire web site with:

wget --tries=inf -nH --no-parent --random-wait -r -l inf --convert-links --html-extension www.example.com

But that is if I am mirroring an _entire_ web site - where the URL looks like www.example.com. BUT, how can I mirror a URL that looks like:

http://www.example.com/~user/dir/

and get everything starting with ~user/dir/ and everything underneath it, but nothing above it? For instance, if there was a link back to ~user/otherdir/, I would not want to get that. So basically, I want to mirror ~user/dir/ and below, and follow nothing else - how can I do that?

thanks.
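Concretely, keeping the flags from the full-site command and pointing them at the subdirectory would look like this (www.example.com and ~user/dir/ are the hypothetical names from the question):

```shell
# -np/--no-parent confines the recursion to ~user/dir/ and below;
# links up to ~user/otherdir/ or the site root are not followed.
wget --tries=inf -nH --no-parent --random-wait -r -l inf \
     --convert-links --html-extension http://www.example.com/~user/dir/
```

Note the trailing slash: without it, wget may treat dir as a file and take ~user/ as the parent directory for the -np check.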
RE: problem with LF/CR etc.
That is _really_ ugly, and perhaps immoral. Make it an option, if you must. Certainly don't make it the default behavior. <Shudder>

Mark Post

-----Original Message-----
From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 19, 2003 4:59 PM
To: Peter GILMAN
Cc: [EMAIL PROTECTED]
Subject: Re: problem with LF/CR etc.

Peter GILMAN <[EMAIL PROTECTED]> writes:

> i have run into a problem while using wget: when viewing a web page
> with html like this:
>
>     <a href="images/IMG_01
>     .jpg"><img src="images/tnIMG_01
>     .jpg"></a>

Eek! Are people really doing that? This is news to me.

> browsers (i tested with mozilla and IE) can handle the line breaks in
> the urls (presumably stripping them out), but wget chokes on the
> linefeeds and carriage returns; it inserts them into the urls, and
> then (naturally) fails with a 404: [...]

So, Wget should squash all newlines? It's not hard to implement, but it feels kind of ... unclean.
RE: SSL over proxy passthrough
I tested the Windows binary against the only SSL-enabled web server outside our firewall that I could think of at the moment, and it worked for me.

Mark Post

-----Original Message-----
From: Herold Heiko [mailto:[EMAIL PROTECTED]
Sent: Friday, November 28, 2003 3:18 AM
To: [EMAIL PROTECTED]
Cc: List Wget (E-mail)
Subject: RE: SSL over proxy passthrough

For whoever wants to test that from Windows, there is an MSVC binary at http://xoomer.virgilio.it/hherold/

Heiko

-- PREVINET S.p.A.      www.previnet.it
-- Heiko Herold         [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax

-----Original Message-----
From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]
Sent: Friday, November 28, 2003 3:26 AM
To: [EMAIL PROTECTED]
Subject: SSL over proxy passthrough

This patch implements a first attempt of using the CONNECT method to establish passthrough of SSL communication over non-SSL proxies. This will require testing.

2003-11-28  Hrvoje Niksic  <[EMAIL PROTECTED]>

	* http.c (gethttp): Use the CONNECT handle to establish SSL
	passthrough through non-SSL proxies.

Index: src/http.c
===================================================================
RCS file: /pack/anoncvs/wget/src/http.c,v
retrieving revision 1.125
diff -u -r1.125 http.c
--- src/http.c	2003/11/27 23:29:36	1.125
+++ src/http.c	2003/11/28 02:22:00
@@ -804,7 +804,7 @@
   authenticate_h = NULL;
   auth_tried_already = 0;
 
-  inhibit_keep_alive = !opt.http_keep_alive || proxy != NULL;
+  inhibit_keep_alive = !opt.http_keep_alive;
 
  again:
   /* We need to come back here when the initial attempt to retrieve
@@ -825,21 +825,72 @@
   hs->remote_time = NULL;
   hs->error = NULL;
 
-  /* If we're using a proxy, we will be connecting to the proxy
-     server.  */
-  conn = proxy ? proxy : u;
+  conn = u;
+
+  proxyauth = NULL;
+  if (proxy)
+    {
+      char *proxy_user, *proxy_passwd;
+      /* For normal username and password, URL components override
+         command-line/wgetrc parameters.  With proxy
+         authentication, it's the reverse, because proxy URLs are
+         normally the permanent ones, so command-line args
+         should take precedence.  */
+      if (opt.proxy_user && opt.proxy_passwd)
+        {
+          proxy_user = opt.proxy_user;
+          proxy_passwd = opt.proxy_passwd;
+        }
+      else
+        {
+          proxy_user = proxy->user;
+          proxy_passwd = proxy->passwd;
+        }
+      /* This does not appear right.  Can't the proxy request,
+         say, `Digest' authentication?  */
+      if (proxy_user && proxy_passwd)
+        proxyauth = basic_authentication_encode (proxy_user, proxy_passwd,
+                                                 "Proxy-Authorization");
+
+      /* If we're using a proxy, we will be connecting to the proxy
+         server.  */
+      conn = proxy;
+    }
+
   host_lookup_failed = 0;
+  sock = -1;
 
   /* First: establish the connection.  */
-  if (inhibit_keep_alive
-      || !persistent_available_p (conn->host, conn->port,
+
+  if (!inhibit_keep_alive)
+    {
+      /* Look for a persistent connection to target host, unless a
+         proxy is used.  The exception is when SSL is in use, in which
+         case the proxy is nothing but a passthrough to the target
+         host, registered as a connection to the latter.  */
+      struct url *relevant = conn;
 #ifdef HAVE_SSL
-                                  u->scheme == SCHEME_HTTPS
+      if (u->scheme == SCHEME_HTTPS)
+        relevant = u;
+#endif
+
+      if (persistent_available_p (relevant->host, relevant->port,
+#ifdef HAVE_SSL
+                                  relevant->scheme == SCHEME_HTTPS,
 #else
-                                  0
+                                  0,
 #endif
-                                  , host_lookup_failed))
+                                  host_lookup_failed))
+        {
+          sock = pconn.socket;
+          using_ssl = pconn.ssl;
+          logprintf (LOG_VERBOSE, _("Reusing existing connection to %s:%d.\n"),
+                     pconn.host, pconn.port);
+          DEBUGP (("Reusing fd %d.\n", sock));
+        }
+    }
+
+  if (sock < 0)
     {
       /* In its current implementation, persistent_available_p will
         look up conn->host in some cases.  If that lookup failed, we
@@ -855,28 +906,75 @@
         ? CONERROR : CONIMPOSSIBLE);
 
 #ifdef HAVE_SSL
-  if (conn->scheme == SCHEME_HTTPS)
-    {
-      if (!ssl_connect (sock))
-        {
-          logputs (LOG_VERBOSE, "\n");
-          logprintf (LOG_NOTQUIET,
-                     _("Unable to establish SSL connection.\n"));
-          fd_close (sock);
-          return CONSSLERR;
-        }
-      using_ssl = 1;
-    }
+  if (proxy && u->scheme == SCHEME_HTTPS)
+    {
+      /* When requesting SSL URLs through proxies, use the
+         CONNECT method to request passthrough.  */
+      char *connect =
+        (char *) alloca (64 +
RE: wget can't get the following site
Because the URL has special characters in it, surround it in double quotes:

wget "http://quicktake.morningstar.com/Stock/Income10.asp?Country=USA&Symbol=JNJ&stocktab=finance"

Mark Post

-----Original Message-----
From: David C. [mailto:[EMAIL PROTECTED]
Sent: Friday, January 09, 2004 2:01 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: wget can't get the following site

Hi, all. Please CC me when you reply; I'm not subscribed to this list. I'm new to wget. When I tried getting the following using wget:

wget http://quicktake.morningstar.com/Stock/Income10.asp?Country=USA&Symbol=JNJ&stocktab=finance

I got the errors below:

--22:58:29-- http://quicktake.morningstar.com:80/Stock/Income10.asp?Country=USA
           => [EMAIL PROTECTED]'
Connecting to quicktake.morningstar.com:80... connected!
HTTP request sent, awaiting response... 302 Object moved
Location: http://quote.morningstar.com/switch.html?ticker= [following]
--22:58:30-- http://quote.morningstar.com:80/switch.html?ticker=
           => [EMAIL PROTECTED]'
Connecting to quote.morningstar.com:80... connected!
HTTP request sent, awaiting response... 302 Object moved
Location: TickerNotFound.html [following]
TickerNotFound.html: Unknown/unsupported protocol.
'Symbol' is not recognized as an internal or external command, operable program or batch file.
'stocktab' is not recognized as an internal or external command, operable program or batch file.

Is this a bug in wget? Or is there something I can do so that wget can get the site? Please help! Thanks in advance.
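The "'Symbol' is not recognized as an internal or external command" errors above are the giveaway: the shell split the command line at each unquoted &, so wget only ever saw the URL up to ?Country=USA. On a Unix shell, single or double quotes both work; the key point is that the URL must reach wget as one argument:

```shell
# Quoted, the ampersands stay inside a single argument, exactly as
# wget must receive them.
url='http://quicktake.morningstar.com/Stock/Income10.asp?Country=USA&Symbol=JNJ&stocktab=finance'
wget "$url"
```

On Windows cmd.exe, double quotes around the URL serve the same purpose, since & is cmd's command separator.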
RE: wget -- ftp with proxy
Yes, it should be. Mark Post -Original Message- From: Cui, Byron [mailto:[EMAIL PROTECTED]] Sent: Tuesday, January 13, 2004 11:57 AM To: [EMAIL PROTECTED] Subject: wget -- ftp with proxy Hi, If using FTP through a proxy, would the passive-ftp option still be valid? Thanks. Byron Cui e-Commerce Infrastructure Support and Information Security IBG Production Support Phone: 416-867-6822 Fax: 416-867-7157
RE: Syntax question ...
Well, that's what you're telling it to do with the -S option, so why are you surprised? man wget, then /-S

Mark Post

-Original Message-
From: Simons, Rick [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 21, 2004 11:09 AM
To: '[EMAIL PROTECTED]'
Subject: RE: Syntax question ...

I got wget compiled with ssl support now, and have a followup question ... I'm getting the local file created but populated with a server response, not the actual contents of the remote file. See example:

wget -d -S https://server/testfile --http-user=user --http-passwd=pass
DEBUG output created by Wget 1.9.1 on linux-gnu.
--10:55:06-- https://server/testfile
           => `testfile'
Resolving server... ip
Caching server => ip
Connecting to server[ip]:443... connected.
Created socket 3.
Releasing 0x81229f0 (new refcount 1).
---request begin---
GET /testfile HTTP/1.0
User-Agent: Wget/1.9.1
Host: server
Accept: */*
Connection: Keep-Alive
Authorization: Basic cmlja3M6cmlja3MyNjI2
---request end---
HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Wed, 21 Jan 2004 16:04:01 GMT
 2 Date: Wed, 21 Jan 2004 16:04:01 GMT
Server: Apache/1.3.26 (Unix) mod_ssl/2.8.10 OpenSSL/0.9.6g SecureTransport/4.1.2
 3 Server: Apache/1.3.26 (Unix) mod_ssl/2.8.10 OpenSSL/0.9.6g SecureTransport/4.1.2
Set-Cookie: FDX=ocjoMt028Um+ri2vZQ0L6g==; path=/
 4 Set-Cookie: FDX=ocjoMt028Um+ri2vZQ0L6g==; path=/
Stored cookie filed2 443 / nonpermanent 0 undefined FDX ocjoMt028Um+ri2vZQ0L6g==
Accept-Ranges: bytes
 5 Accept-Ranges: bytes
Expires: Thu, 01 Jan 1970 00:00:00 GMT
 6 Expires: Thu, 01 Jan 1970 00:00:00 GMT
Features: CHPWD;RTCK;STCK;ASC
 7 Features: CHPWD;RTCK;STCK;ASC
Connection: close
 8 Connection: close
Content-Type: text/plain; charset=UTF-8
 9 Content-Type: text/plain; charset=UTF-8

[ <=> ] 30 --.--K/s
Closing fd 3
10:55:07 (292.97 KB/s) - `testfile' saved [30]

cat testfile
Virtual user username logged in.
ssl access log: ip - user [21/Jan/2004:10:04:02 -0600] GET /testfile HTTP/1.0 200 30 ssl error log: [Wed Jan 21 10:04:01 2004] [info] VIRTUAL HTTP LOGIN FROM ip [ip], user (class virt) Further thoughts or suggestions? -Original Message- From: Hrvoje Niksic [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 21, 2004 9:41 AM To: Simons, Rick Cc: '[EMAIL PROTECTED]' Subject: Re: Syntax question ... Simons, Rick [EMAIL PROTECTED] writes: Greetings all. I've posted in the past, but never really have gotten connectivity to a https server I support using the wget application. I've looked in the manual, on the website and searched the Internet but am not getting very far. wget -V GNU Wget 1.9 wget -d -S https://server/file https://server/file: Unsupported scheme. This error message indicates that your version of Wget is compiled without SSL support. I then decided (based on previous instruction from this group) to recompile wget with ssl. This is on a RH9 box, with openssl libs in /usr/include/openssl ./configure --with-ssl=/usr/include/openssl/ compiles Looking for SSL libraries in /usr/include/openssl/ checking for includes... not found ERROR: Failed to find OpenSSL libraries. Try just `./configure', it should find the SSL libraries in the default location. At least it does for me -- I use RH9.
RE: problem with # in path
It's more likely your system/shell that is doing it, if you're using Linux or UNIX. wget -r -l 0 ftp://19.24.24.24/some/datase/C\#Tool/ Mark Post -Original Message- From: Peter Mikeska [mailto:[EMAIL PROTECTED]] Sent: Thursday, January 22, 2004 6:28 PM To: [EMAIL PROTECTED] Subject: problem with # in path Hi, I'm trying to get everything from wget -r -l 0 ftp://19.24.24.24/some/datase/C#Tool/ but I can't get anything, because wget cuts off everything from the # onward; it thinks it's a comment. Please help. Thanks in advance, Miki +---V---+ | Peter Mikeska |[EMAIL PROTECTED] | | A L C A T E L | | System Engineer | phone: +421 44 5206316 | +---+ | IT Services MadaCom | fax: +421 44 5206356 |
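Besides backslash-escaping, percent-encoding sidesteps the '#' problem entirely: RFC 3986 reserves '#' to introduce a URL fragment, so encoding it as %23 keeps it part of the path no matter who parses the URL first. A sketch using the path from the thread:

```shell
# '#' begins a URL fragment, so encode it as %23 in the request path.
url='ftp://19.24.24.24/some/datase/C%23Tool/'
printf '%s\n' "$url"
```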
RE: GNU Wget 1.9.1
It's a known bug. I'm waiting for a fix for it myself. Mark Post -Original Message- From: Lawrance, Mark [mailto:[EMAIL PROTECTED]] Sent: Wednesday, May 12, 2004 9:09 AM To: [EMAIL PROTECTED] Subject: GNU Wget 1.9.1 GNU Wget 1.9.1 The non-interactive download utility Updated for Wget 1.9.1, May 2003 I am unable to get wget to work via a proxy for HTTPS sites. It does work via proxy for HTTP. It does work with HTTPS NOT through proxy. Any ideas? Should this work? Mark Lawrance Senior Wintel Architect, Architecture and Engineering, London Stock Exchange DDE: +44 (0)20 7797 1277 Mobile: +44 (0)7971 032235 [EMAIL PROTECTED]
RE: Preserving file ownership
I don't believe so. You might want to take a look at rsync instead. It does a very nice job of doing just what you need. Mark Post -Original Message- From: Kathryn Moretz [mailto:[EMAIL PROTECTED] Sent: Thursday, April 29, 2004 4:40 PM To: [EMAIL PROTECTED] Cc: Kathryn Moretz Subject: Preserving file ownership I am using wget to mirror multiple directories between 2 servers via FTP. This mirroring process will be running as root in the background continuously. The directories / files are owned by different users and groups. Is there a way to preserve this ownership when files are transferred from the remote host to the local host? As it currently stands, the mirrors are owned by root:system instead of by the original owner. Thank you in advance. Please cc me in any replies to this post, as I do not currently subscribe to this list.
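For reference, a sketch of the rsync form of that mirror job. The host and paths are placeholders, not from the thread; -a (archive mode) implies -o and -g, which preserve owner and group when rsync runs as root on the receiving side.

```shell
# Hypothetical host/paths; echoed rather than executed.
cmd="rsync -a remote.example.com:/export/tree/ /local/mirror/"
echo "$cmd"
```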
RE: wget hangs or downloads end up incomplete in Windows 2000 XP.
Are you behind a firewall or proxy of some kind? If so, you might want to try using passive FTP mode.

Mark Post

-Original Message-
From: Phillip Pi [mailto:[EMAIL PROTECTED]]
Sent: Thursday, May 20, 2004 3:08 PM
To: [EMAIL PROTECTED]
Subject: RE: wget hangs or downloads end up incomplete in Windows 2000 XP.

FYI. I noticed if I ctrl-c to get out of the hanging part and try to resume, my FTP seems to be broken and hangs. I tried manually with the ftp.exe command on the command line, and it froze on the dir command:

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\Documents and Settings\phillip_pi>ftp 192.168.14.18
Connected to 192.168.14.18.
220 USSM-CPD Microsoft FTP Service (Version 5.0).
User (192.168.14.18:(none)): domain\username
331 Password required for domain\username.
Password:
230 User domain\username logged in.
ftp> dir
200 PORT command successful.
150 Opening ASCII mode data connection for /bin/ls.
[stuck forever until I ctrl-c to break out of it]

I either have to reboot the computer OR wait maybe ten minutes to try again with the FTP connection. -- This is the ant. Treat it with respect. For it may very well be the next dominant lifeform of our planet. --Empire of the Ants movie /\___/\ / /\ /\ \ Phillip Pi (Ant) @ The Ant Farm: http://antfarm.ma.cx | |o o| | E-mail: [EMAIL PROTECTED] or [EMAIL PROTECTED] \ _ /Be sure you removed ANT from e-mail address if you get ( ) a returned e-mail. On Wed, 12 May 2004, Phillip Pi wrote: On Wed, 12 May 2004, Herold Heiko wrote: OK, I did more tests. I noticed -v is already enabled by default since you probably have verbose=on in your wgetrc file. Good idea. Should I delete wgetrc? I doubt that will fix my problem since I tried on two different Windows machines. FYI. I don't have a wgetrc file anywhere. I only have the sample.wgetrc file on the machines I used, so it sounds like verbose is enabled by default regardless of the wgetrc file. 5250K .. .. .. .. ..
The timestamp was from almost an hour ago (I was in a meeting) during the download test. Notice it never timed out to retry or abort! Please What happens if you restart wget again with mirror-like options on the same directory tree ? Does it hang again on the same file ? If yes, what if you Which wget parameter options are they? I never noticed it hangs on the same file for each hang in different tests. Please remember sometimes I have missing files when downloads are complete. It is either hang, finish but incomplete, or perfect. Those are the three results I have seen from many tests. try to download that file only ? If not, could you by any chance run a sniffer on that machine (ethereal is free) ? I do not know how to use this network tool. If you can give me instructions I can try! It would be useful to know if really everything is frozen, or if, for example, for some reason the data is just trickling down at 1 byte/minute or something similar (stuck in retransmission?). I have no idea. Does wget have a bytes/bits per second statistics?
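If passive mode turns out to help, it can be made the default rather than passed on every command line; `passive_ftp` is a documented wgetrc setting:

```
# ~/.wgetrc
passive_ftp = on
```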
RE: Escaping semicolons (actually Ampersands)
Then you haven't looked at enough web sites. Whenever tidydbg (from w3.org) tells me to do that in one of my URLs, I do that. I've got one page of links that has tons of them. They work. Can we stop arguing about this off-topic bit now? Mark Post -Original Message- From: Tony Lewis [mailto:[EMAIL PROTECTED]] Sent: Monday, June 28, 2004 10:17 PM To: Phil Endecott; [EMAIL PROTECTED] Subject: Re: Escaping semicolons (actually Ampersands) Phil Endecott wrote: Tony The stuff between the quotes following HREF is not HTML; it is a Tony URL. Hence, it must follow URL rules not HTML rules. No, it's both a URL and HTML. It must follow both rules. Please see the page that I cited in my previous message: http://www.htmlhelp.com/tools/validator/problems.html#amp I've looked at hundreds of web pages and I've never seen anyone put &amp; into HREF in place of an ampersand. Tony
RE: Metric units
Yeah, you're both right. While we're at it, why don't we just round off the value of pi to be 3.0. Those pesky trailing decimals are just an accident of history anyway. -Original Message- From: Carlos Villegas [mailto:[EMAIL PROTECTED]] Sent: Thursday, December 23, 2004 8:22 PM To: Tony Lewis Cc: wget@sunsite.dk Subject: Re: Metric units On Thu, Dec 23, 2004 at 12:57:18PM -0800, Tony Lewis wrote: John J Foerch wrote: It seems that the system of using the metric prefixes for numbers 2^n is a simple accident of history. Any thoughts on this? I would say that the practice of using powers of 10 for K and M is a response to people who cannot think in binary. I would say that the original poster understands what he is saying, and you clearly don't... http://physics.nist.gov/cuu/Units/binary.html kilo, mega, giga, tera and many others are standard in SI and widely used in physics, chemistry, and engineering with their real meaning (powers of 10). The whole powers-of-2 thing is just because 1024 is close to 1000 and computers work in binary, so it is logical to think in powers of 2 (so yes, a mere accident of 20th century history). Carlos
RE: Metric units
No, but that particular bit of idiocy was the inspiration for my comment. I just took it one decimal point further. -Original Message- From: Tony Lewis [mailto:[EMAIL PROTECTED] Sent: Friday, December 24, 2004 2:22 AM To: wget@sunsite.dk Subject: RE: Metric units Mark Post wrote: While we're at it, why don't we just round off the value of pi to be 3.0 Do you live in Indiana? Actually, Dr. Edwin Goodwin wanted to round off pi to any of several values including 3.2. http://www.agecon.purdue.edu/crd/Localgov/Second%20Level%20pages/Indiana_Pi_ Story.htm Tony
RE: selective recursive downloading
wget -m -np http://url.to.download/something/group-a/want-to-download/ \ http://url.to.download/something/group-b/want-to-download/ \ http://url.to.download/something/group-c/want-to-download/ Mark Post -Original Message- From: Gabor Istvan [mailto:[EMAIL PROTECTED]] Sent: Friday, January 21, 2005 9:16 AM To: wget@sunsite.dk Subject: selective recursive downloading Dear All: I would like to know how I could use wget to selectively download certain subdirectories of a main directory. Here is what I want to do: Let's assume that we have a directory structure like this: http://url.to.download/something/group-a/want-to-download/ http://url.to.download/something/group-a/not-to-download/ http://url.to.download/something/group-b/want-to-download/ http://url.to.download/something/group-b/not-to-download/ http://url.to.download/something/group-c/want-to-download/ http://url.to.download/something/group-c/not-to-download/ I would like to download all of the files from the want-to-download subdirectories of groups a, b and c. I don't want to download anything from the not-to-download subdirectories of groups a, b and c. There are a lot of groups, so it would be very painful to use -I or -X, as these options, as far as I know, require a full path definition, which would be different for each wanted and not-wanted directory because of the different groups. My question is: how could I automate the downloads, and what should I write on the command line? Thanks for your answer. Please send a copy to my email address ([EMAIL PROTECTED]) since I am not subscribed to the wget list. IG
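Since there are a lot of groups, the URL list in that command can be generated instead of typed. A sketch, using the placeholder names from the thread (substitute the real group names):

```shell
base='http://url.to.download/something'
urls=''
for g in group-a group-b group-c; do    # extend with the real groups
  urls="$urls $base/$g/want-to-download/"
done
# Echoed rather than run; drop the echo to perform the download.
echo wget -m -np $urls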
RE: 403 Forbidden Errors with mac.com
Don't know what is happening on your end. I just executed wget http://idisk.mac.com/tombb/Public/tex-edit-plus-X.sit and it downloaded 2,484,062 bytes of something. What does using the -d option show you? Mark Post -Original Message- From: Emily Jackson [mailto:[EMAIL PROTECTED]] Sent: Tuesday, February 08, 2005 6:26 AM To: wget@sunsite.dk Subject: 403 Forbidden Errors with mac.com This produces a 403 Forbidden error: wget http://idisk.mac.com/tombb/Public/tex-edit-plus-X.sit as does this: wget --user-agent="Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/125.5.6 (KHTML, like Gecko) Safari/125.12" http://idisk.mac.com/tombb/Public/tex-edit-plus-X.sit (all on one line, of course; the user agent specified is for Apple's Safari browser) curl -O works fine, however. What else could I try to be able to use wget to download this file? [wget 1.9.1, Mac OS X 10.3.7] (Please cc any replies directly to me.) Thanks, Emily -- If it seem slow, wait for it; it will surely come, it will not delay. Emily Jackson http://home.hiwaay.net/~emilyj/missjackson.html
RE: bug-wget still useful
I don't know why you say that. I see bug reports and discussion of fixes flowing through here on a fairly regular basis. Mark Post -Original Message- From: Dan Jacobson [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 15, 2005 3:04 PM To: [EMAIL PROTECTED] Subject: bug-wget still useful Is it still useful to mail to [EMAIL PROTECTED] I don't think anybody's home. Shall the address be closed?
RE: links conversion; non-existent index.html
Probably because you're the only one that thinks it is a problem, instead of the way it needs to function? Nah, that couldn't be it. Mark Post -Original Message- From: Andrzej Kasperowicz [mailto:[EMAIL PROTECTED] Sent: Sunday, May 01, 2005 2:54 PM To: Jens Rösner; wget@sunsite.dk Subject: Re: links conversion; non-existent index.html -snip- You expect?? Yes, of course. Why are you so surprised? a.
RE: Switching to subversion for version control
You might want to give Ibiblio a try (www.ibiblio.org). They host my Slack/390 web/FTP site at no cost. They host a _bunch_ of sites at no cost. Mark Post -Original Message- From: Hrvoje Niksic [mailto:[EMAIL PROTECTED] Sent: Thursday, May 12, 2005 5:24 AM To: wget@sunsite.dk Subject: Switching to subversion for version control -snip- I'm also interested in information about free svn hosting. sunsite/dotsrc and savannah.gnu.org currently don't seem to be offering subversion hosting. There is www.berlios.de, but I have no experience with them.
RE: Switching to subversion for version control
I really don't know, but they seem very accommodating to people, especially Open Source projects such as wget. It's certainly worth an email to find out. Send your request to help at ibiblio.org. Mark Post -Original Message- From: Hrvoje Niksic [mailto:[EMAIL PROTECTED] Sent: Thursday, May 12, 2005 3:46 PM To: Post, Mark K Cc: wget@sunsite.dk Subject: Re: Switching to subversion for version control Post, Mark K [EMAIL PROTECTED] writes: You might want to give Ibiblio a try (www.ibiblio.org). They host my Slack/390 web/FTP site at no cost. They host a _bunch_ of sites at no cost. But do they host subversion? I can't find any mention of it with google.
RE: No more Libtool (long)
I read the entire message, but I probably didn't have to. My experience with libtool in packages that really are building libraries has been pretty painful. Since wget doesn't build any, getting rid of it is one less thing to kill my builds in the future. Congratulations. Mark Post -Original Message- From: Hrvoje Niksic [mailto:[EMAIL PROTECTED] Sent: Friday, June 24, 2005 8:11 PM To: wget@sunsite.dk Subject: No more Libtool (long) Thanks to the effort of Mauro Tortonesi and the prior work of Bruno Haible, Wget has been modified to no longer use Libtool for linking in external libraries. If you are interested in why that might be a cause for celebration, read on.
RE: No more Libtool (long)
This is the kind of obnoxious commentary I've learned to expect from glibc's maintainers. It's no more becoming from you (or anyone else). Buzz off. Mark Post -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Maciej W. Rozycki Sent: Monday, June 27, 2005 8:01 AM To: Hrvoje Niksic Cc: wget@sunsite.dk Subject: Re: No more Libtool (long) -snip- Everyone else please either file bug reports (or better yet fix bugs you trip over) or keep silent.
RE: No more Libtool (long)
You already blew that opportunity when you told us to shut up. Blame yourself. Mark Post -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Maciej W. Rozycki Sent: Monday, June 27, 2005 11:15 AM To: Post, Mark K Cc: wget@sunsite.dk Subject: RE: No more Libtool (long) -snip- Let's focus on technical issues rather than making it personal, OK? Maciej
RE: robots.txt takes precedence over -p
I hope that doesn't happen. While respecting robots.txt is not an absolute requirement, it is considered polite. I would not want the default behavior of wget to be considered impolite. Mark Post -Original Message- From: Mauro Tortonesi [mailto:[EMAIL PROTECTED] Sent: Monday, August 08, 2005 7:43 PM To: Tony Lewis Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: robots.txt takes precedence over -p On Sunday 10 July 2005 09:52 am, Tony Lewis wrote: Thomas Boerner wrote: Is this behaviour: robots.txt takes precedence over -p a bug or a feature? It is a feature. If you want to ignore robots.txt, use this command line: wget -p -k www.heise.de/index.html -e robots=off hrvoje was thinking of changing the default behavior of wget to ignore the robots standard in the next releases. -- Aequam memento rebus in arduis servare mentem... Mauro Tortonesi http://www.tortonesi.com University of Ferrara - Dept. of Eng.http://www.ing.unife.it Institute for Human Machine Cognition http://www.ihmc.us GNU Wget - HTTP/FTP file retrieval tool http://www.gnu.org/software/wget Deep Space 6 - IPv6 for Linuxhttp://www.deepspace6.net Ferrara Linux User Group http://www.ferrara.linux.it
RE: robots.txt takes precedence over -p
I would say the analogy is closer to a very rabid person operating a web browser. I've never been greatly inconvenienced by having to re-run a download while ignoring the robots.txt file. As I said, respecting robots.txt is not a requirement, but it is polite. I prefer my tools to be polite unless I tell them otherwise. Mark Post -Original Message- From: Mauro Tortonesi [mailto:[EMAIL PROTECTED]] Sent: Monday, August 08, 2005 8:35 PM To: Post, Mark K Cc: [EMAIL PROTECTED] Subject: Re: robots.txt takes precedence over -p On Monday 08 August 2005 07:30 pm, Post, Mark K wrote: I hope that doesn't happen. While respecting robots.txt is not an absolute requirement, it is considered polite. I would not want the default behavior of wget to be considered impolite. IMVHO, hrvoje has a good point when he says that wget behaves like a web browser and, as such, should not be required to respect the robots standard.
RE: wget displays permission error
In the past, I have been confused as to whether the file which was generating the error was on the server, or on my local system. If there is a way to distinguish between the two, and be more explicit, that would be a little more helpful. I don't see any way wget could/should do anything except report the error. Mark Post -Original Message- From: Hrvoje Niksic [mailto:[EMAIL PROTECTED] Sent: Thursday, September 01, 2005 6:16 AM Cc: Kentaro Ozawa; [EMAIL PROTECTED] Subject: Re: wget displays permission error Jochen Roderburg [EMAIL PROTECTED] writes: Hmm, this did not actually try to write over 'index.html', did it ;-) Do the same with 'timestamping on' and you get (not surprisingly and with 'all' wget versions I have around) : index.html: Permission denied Cannot write to `index.html' (Permission denied). But what is Wget to do in such a case except report an error?
RE: retr.c:292: calc_rate: Assertion `bytes >= 0' failed.
Odd. It didn't take me long to find this: http://ftp.us.debian.org/debian/pool/main/w/wget/wget_1.10.2-1_i386.deb Mark Post -Original Message- From: Simeon Miteff [mailto:[EMAIL PROTECTED]] Sent: Thursday, November 24, 2005 2:10 AM To: [EMAIL PROTECTED] Subject: retr.c:292: calc_rate: Assertion `bytes >= 0' failed. Hi I don't know if this is a known bug (I could not get any useful results out of the bugzilla), but if it isn't, the server shown in this example is public, so the problem should be reproducible. I realise that 1.10.2 is the latest version, but Debian doesn't seem to think so :-)
RE: retr.c:292: calc_rate: Assertion `bytes >= 0' failed.
Not really. Debian will let you install whatever you want, provided the dependencies are satisfied. If you set up your apt parms properly, you can download and install packages from stable, testing, unstable, etc. If you don't want to do that for everything, you can set them back to just pick up new package versions from the stable channel. Mark Post -Original Message- From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]] Sent: Thursday, November 24, 2005 4:43 PM To: Post, Mark K Cc: Simeon Miteff; [EMAIL PROTECTED] Subject: Re: retr.c:292: calc_rate: Assertion `bytes >= 0' failed. Post, Mark K [EMAIL PROTECTED] writes: Odd. It didn't take me long to find this: http://ftp.us.debian.org/debian/pool/main/w/wget/wget_1.10.2-1_i386.deb It's questionable whether that's installable on stable Debian.
RE: Limit time to run
I think that a combination of --limit-rate and --wait parameters makes this type of enhancement unnecessary, given that his stated purpose was to not hammer a particular site. Mark Post -Original Message- From: Mauro Tortonesi [mailto:[EMAIL PROTECTED]] Sent: Wednesday, November 30, 2005 12:02 PM To: Frank McCown Cc: wget@sunsite.dk Subject: Re: Limit time to run Frank McCown wrote: It would be great if wget had a way of limiting the amount of time it took to run so it won't accidentally hammer on someone's web server for an indefinite amount of time. I'm often needing to let a crawler run for a while on an unknown site, and I have to manually kill wget after a few hours if it hasn't finished yet. It would be nice if I could do: wget --limit-time=120 ... to make it stop itself after 120 minutes. Please cc me on any replies. i don't think we need to add this feature to wget, as it can be achieved with a shell script that launches wget in background, sleeps for the given amount of time and then kills the wget process. however, if there is a general consensus about adding this feature to wget, i might consider changing my mind.
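The wrapper Mauro describes takes only a few lines of shell. A sketch, with a 2-second stand-in for the real wget command and the 120-minute limit:

```shell
LIMIT=2                       # seconds; 120 minutes would be 7200
sleep 100 &                   # stand-in for the long-running wget
PID=$!
# Watchdog: after LIMIT seconds, kill the worker if it is still alive.
( sleep "$LIMIT" && kill "$PID" 2>/dev/null ) &
wait "$PID" 2>/dev/null       # returns once the worker finishes or is killed
echo "stopped after ${LIMIT}s"
```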
RE: wget - tracking urls/web crawling
Try using the -np (no parent) parameter. Mark Post -Original Message- From: bruce [mailto:[EMAIL PROTECTED] Sent: Thursday, June 22, 2006 4:15 PM To: 'Frank McCown'; wget@sunsite.dk Subject: RE: wget - tracking urls/web crawling hi frank... there must be something simple i'm missing... i'm looking to crawl the site http://timetable.doit.wisc.edu/cgi-bin/TTW3.search.cgi?20071 i issue the wget: wget -r -np http://timetable.doit.wisc.edu/cgi-bin/TTW3.search.cgi?20071 i thought that this would simply get everything under the http://...?20071. however, it appears that wget is getting 20062, etc.. which are the other semesters... what i'd really like to do is to simply get 'all depts' for each of the semesters... any thoughts/comments/etc... -bruce
Excluding directories
I'm trying to download parts of the SUSE Linux 10.1 tree. I'm going after things below http://suse.mirrors.tds.net/pub/suse/update/10.1/, but I want to exclude several directories in http://suse.mirrors.tds.net/pub/suse/update/10.1/rpm/ In that directory are the following subdirectories: i586/ i686/ noarch/ ppc/ ppc64/ src/ x86_64/ I only want the i586, i686, and noarch directories. I tried using the -X parameter, but it only seems to work if I specify -X /pub/suse/update/10.1/rpm/ppc,/pub/suse/update/10.1/rpm/ppc64,/pub/suse/ update/10.1/rpm/src,/pub/suse/update/10.1/rpm/x86_64 Is this the only way it's supposed to work? I was hoping to get away with something along the lines of -X rpm/ppc,rpm/src or -X ppc,src and so on. Thanks, Mark Post
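One possibly shorter spelling: the wget manual documents wildcard support in the -I/-X directory lists, so patterns should be able to replace the repeated absolute paths. A sketch, echoed rather than run so as not to hit the mirror (untested against that tree):

```shell
cmd="wget -r -np -X '*/rpm/ppc*,*/rpm/src,*/rpm/x86_64' http://suse.mirrors.tds.net/pub/suse/update/10.1/"
echo "$cmd"
```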
RE: wget 403 forbidden error when no index.html.
The short answer is that you don't get to do it. If your browser can't do it, wget isn't going to be able to do it. Mark Post -Original Message- From: news [mailto:[EMAIL PROTECTED]] On Behalf Of Aditya Joshi Sent: Friday, July 07, 2006 12:15 PM To: wget@sunsite.dk Subject: wget 403 forbidden error when no index.html. I am trying to download the contents of a specific directory of a site and I keep getting a 403 Forbidden error when I run wget. The directory does not have an index.html and, of course, any references to that path result in a 403 page displayed in my browser. Is this why wget is not working? If so, how do I download the contents of such sites?
RE: Wget
You would want to use the -O option, and write a script to create a unique file name to be passed to wget. Mark Post From: John McGill [mailto:[EMAIL PROTECTED] Sent: Thursday, July 13, 2006 4:56 AMTo: wget@sunsite.dkSubject: Wget Hi, I hope you can help with a small problem I am having with the above win32 application. I wish to download a jpeg image from a camera, at the same time every day for the duration of a project. I need to be able to give the downloaded file either a unique file identifier or a time stamp so that I can compile a sequence at the end of the project. Is there a way of telling wget to download the image and increment the file number or add the date/time stamp? I am sure you are a very busy person and I hope you will have the time to answer my rather basic question. Regards John McGill
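A sketch of that wrapper script: build a timestamped name and hand it to wget -O. The camera URL is a placeholder, so the wget command is echoed rather than run.

```shell
stamp=$(date +%Y%m%d-%H%M%S)          # e.g. 20060713-045600
out="camera-$stamp.jpg"
echo wget -O "$out" "http://camera.example/current.jpg"
```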