wget bug?

2001-06-13 Thread Story, Ian

> Hello,
> I have been a very happy user of wget for a long time.  However, today I
> noticed that some sites, that don't run on port 80, don't work well with
> wget.  For instance, when I tell wget to go get http://www.yahoo.com, it
> automatically puts :80 at the end, like this: http://www.yahoo.com:80.
> That is fine, most of the time, but some sites won't like that, and in
> fact, will give a 404 error, or other errors.  So, I consulted the
> documentation, but couldn't find a way around this...is there a
> fix/workaround/something in the manual that I didn't see or understand to
> get around this?  I tried a few web searches, and didn't find much
> information...
> Thanks VERY much,
> 
> Ian Story
> Knowledge Management Practice
> (253) 924-4689
> [EMAIL PROTECTED]
> 



wget bug (?)

2001-11-14 Thread Bernard, Shawn

I'm not sure if this is a bug or not, but when I ran this line:
    wget -r -l2 http://www.turnerclassicmovies.com/NowPlaying/Index
I get this result:


===


13:11:02 (5.62 MB/s) - `www.turnerclassicmovies.com/NowPlaying/Index' saved [41230]


Loading robots.txt; please ignore errors.
--13:11:02--  http://www.turnerclassicmovies.com/robots.txt
   => `www.turnerclassicmovies.com/robots.txt'
Connecting to www.turnerclassicmovies.com:80... connected!
HTTP request sent, awaiting response... 404 Not found
13:11:02 ERROR 404: Not found.


--13:11:02--  http://www.turnerclassicmovies.com/TCM_css/1,3607,,00.css
   => `www.turnerclassicmovies.com/TCM_css/1,3607,,00.css'
Connecting to www.turnerclassicmovies.com:80... connected!
HTTP request sent, awaiting response... 200 OK
Length: 2,123 [text/css]


    0K ..    100% @   2.02 MB/s


13:11:03 (2.02 MB/s) - `www.turnerclassicmovies.com/TCM_css/1,3607,,00.css' saved [2123/2123]


--13:11:03--  http://www.turnerclassicmovies.com/TCM/Images/spacer.gif
   => `www.turnerclassicmovies.com/TCM/Images/spacer.gif'
Reusing connection to www.turnerclassicmovies.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 43 [image/gif]


    0K   100% @  41.99 KB/s


13:11:03 (41.99 KB/s) - `www.turnerclassicmovies.com/TCM/Images/spacer.gif' saved [43/43]


--13:11:03--  http://www.turnerclassicmovies.com/Home/Index/0,3436,,00.html
   => `www.turnerclassicmovies.com/Home/Index/0,3436,,00.html'
Reusing connection to www.turnerclassicmovies.com:80.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]


    0K .. .. ..   @   5.18 MB/s


13:11:03 (5.18 MB/s) - `www.turnerclassicmovies.com/Home/Index/0,3436,,00.html' saved [27179]


Segmentation Fault(coredump)


==


As you see, there is a Segmentation Fault at the end before it completed.  I couldn't find anything about it on the gnu.org site or the docs or the Internet.  I thought that you'd like to know.

Shawn





wget bug

2001-10-08 Thread Dmitry . Karpov

Dear sir.

When I enter the line
http://find.infoart.ru/cgi-bin/yhs.pl?hidden=http%3A%2F%2F194.67.26.82&word=FreeBSD
into my browser (NN 3), it works correctly.

When I put this line to wget, wget changes it:
the argument hidden becomes "http:/194.67.26.82&word",
and the argument word is empty. Where am I wrong?



wget bug

2001-10-10 Thread Muthu Swamy


Hi,
When I try to send a page to a Nextel mobile using the following command from a Unix box,
"wget http://www.nextel.com/cgi-bin/sendPage.cgi?to01=4157160856%26message=hellothere%26action=send"
wget returns the following message, but the page never reaches the phone:

"--15:59:16-- http://www.nextel.com:80/cgi-bin/sendPage.cgi?to01=4157160856&message=hellothere&action=send
           => `sendPage.cgi?to01=4157160856&message=hellothere&action=send'
Location: http://messaging.nextel.com/cgi/mPageExt.dll?buildIndAddressPage&entry=1 [following]
--15:59:16-- http://messaging.nextel.com:80/cgi/mPageExt.dll?buildIndAddressPage&entry=1
           => `mPageExt.dll?buildIndAddressPage&entry=1.14'
Length: unspecified [text/html]
0K -> .
15:59:16 (75.02 KB/s) - `mPageExt.dll?buildIndAddressPage&entry=1.14' saved [9986]"

But when I send a page from the Nextel.com web site, it reaches my cell phone.
I hope you can help me out; your help would be highly appreciated.
Thanks,
Muthu


wget bug?!

2002-02-18 Thread TD - Sales International Holland B.V.

Hey there,

I want to download a file from Mustek's FTP site in America. This site has a
20-user limit. Have a look at this:

bash-2.05# wget --wait=30 --waitretry=30 -t 0 
ftp://128.121.112.104/pub/1200UBXP/Web.EXE
--15:10:37--  ftp://128.121.112.104/pub/1200UBXP/Web.EXE
   => `Web.EXE'
Connecting to 128.121.112.104:21... connected!
Logging in as anonymous ...
The server refuses login.
Retrying.

--15:10:39--  ftp://128.121.112.104/pub/1200UBXP/Web.EXE
  (try: 2) => `Web.EXE'
Connecting to 128.121.112.104:21... connected!
Logging in as anonymous ...
The server refuses login.
Retrying.

I've tried -w 30
--waitretry=30
--wait=30 (I think this one is for multiple files and the time in between 
those though)

None of these seem to make wget wanna wait for 30 secs before trying again. 
Like this I'm hammering the server.

Please feel free to smack me if I overlooked anything, I used wget --help to 
find out these options. Wget version is bash-2.05# wget --version
GNU Wget 1.7

Copyright (C) 1995, 1996, 1997, 1998, 2000, 2001 Free Software Foundation, 
Inc.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

Originally written by Hrvoje Niksic <[EMAIL PROTECTED]>.

distributed with slackware 8.0 by Patrick Volkerding

regards,

Ferry van Steen




WGET BUG

2002-07-07 Thread Kempston



    Hi, I have a problem and would really like you to help me. I'm using wget
to download a list of file URLs via an HTTP proxy. When the proxy server goes
offline, wget doesn't retry downloading the files. Can you fix that, or can you
tell me how I can fix it?


WGET BUG

2002-07-07 Thread Kempston



Like that:

Connecting to 195.108.41.140:3128... failed: Connection refused.
--01:19:23--  ftp://kempston:*password*@194.151.106.227:15003/Dragon
   => `dragon.001'
Connecting to 195.108.41.140:3128... failed: Connection refused.
--01:19:23--  ftp://kempston:*password*@194.151.106.227:15003/Dragon
   => `dragon.002'
Connecting to 195.108.41.140:3128... failed: Connection refused.
--01:19:23--  ftp://kempston:*password*@194.151.106.227:15003/Dragon
   => `dragon.003'
Connecting to 195.108.41.140:3128... failed: Connection refused.
--01:19:23--  ftp://kempston:*password*@194.151.106.227:15003/Dragon
   => `dragon.004'
Connecting to 195.108.41.140:3128... failed: Connection refused.

FINISHED --01:19:23--
Downloaded: 150,000,000 bytes in 10 files
 
- Original Message -
From: Kempston
To: [EMAIL PROTECTED]
Sent: Monday, July 08, 2002 12:50 AM
Subject: WGET BUG

    Hi, I have a problem and would really like you to help me. I'm using wget
to download a list of file URLs via an HTTP proxy. When the proxy server goes
offline, wget doesn't retry downloading the files. Can you fix that, or can you
tell me how I can fix it?


wget bug

2002-11-05 Thread Jing Ping Ye


Dear Sir:
I tried to use "wget" download data from ftp site but got error message
as following:
> wget  ftp://ftp.ngdc.noaa.gov/pub/incoming/RGON/anc_1m.OCT
Screen show:
--
--09:02:40--  ftp://ftp.ngdc.noaa.gov/pub/incoming/RGON/anc_1m.OCT
   => `anc_1m.OCT'
Resolving ftp.ngdc.noaa.gov... done.
Connecting to ftp.ngdc.noaa.gov[140.172.180.164]:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /pub/incoming/RGON ... done.
==> PORT ... done.    ==> RETR anc_1m.OCT ...
Error in server response, closing control connection.
Retrying.
---
But when I use ftp (ftp ftp.ngdc.noaa.gov), I can get the data.
My computer runs Linux, kernel 2.4.18-10smp i686.
wget version: GNU Wget 1.8.1
I have a script that uses "wget" to get data files automatically every
month. When my computer ran Linux version 6.2, "wget" worked well.
Since I updated to Linux version 7.4, "wget" has not worked, as shown above.
Thank you for your help.
 
-- 
==
Jing Ping Ye    Email: [EMAIL PROTECTED]
    Phone: 303 497 3713
National Geophysical Data Center
CIRES, University of Colorado, Boulder, CO 80309
==
 


wget bug?

2003-02-22 Thread Marian Förster
Hello!

I use wget to transfer WWW pages :-)
but I found the following bug:
If there is a directory, e.g. .../anwendungen/CLIC,
wget transfers the structure to .../anwendungen/clic,
but the links in the WWW pages are left unchanged. That means the link
href="anwendungen/CLIC" is wrong after the transfer, or points to a
non-existent directory.

Can you help me?



wget bug

2003-09-26 Thread Jack Pavlovsky
It's probably a bug: when downloading
wget -mirror ftp://somehost.org/somepath/3acv14~anivcd.mpg, 
 wget saves it as-is, but when downloading
wget ftp://somehost.org/somepath/3*, wget saves the files as 3acv14%7Eanivcd.mpg

--
The human knowledge belongs to the world


Wget Bug

2003-11-10 Thread Kempston
Here is the debug output:

:/FTPD# wget ftp://ftp.dcn-asu.ru/pub/windows/update/winxp/xpsp2-1224.exe -d
DEBUG output created by Wget 1.8.1 on linux-gnu.

--13:25:55--  ftp://ftp.dcn-asu.ru/pub/windows/update/winxp/xpsp2-1224.exe
   => `xpsp2-1224.exe'
Resolving ftp.dcn-asu.ru... done.
Caching ftp.dcn-asu.ru => 212.192.20.40
Connecting to ftp.dcn-asu.ru[212.192.20.40]:21... connected.
Created socket 3.
Releasing 0x8073398 (new refcount 1).
Logging in as anonymous ... 220 news FTP server ready.

--> USER anonymous
331 Guest login ok, send your complete e-mail address as password.
--> PASS -wget@
530 Login incorrect.

Login incorrect.
Closing fd 3

Server reply is 

<--- 530-
<--- 530-Sorry! Too many users are logged in.
<--- 530-Try letter, please.
<--- 530-
<--- 530 Login incorrect.
 Server reply matched ftp:retry-530, retrying

But wget won't even try to retry :(
Can you fix that?

wget bug

2004-01-06 Thread Kairos
$ cat wget.exe.stackdump
Exception: STATUS_ACCESS_VIOLATION at eip=77F51BAA
eax= ebx= ecx=0700 edx=610CFE18 esi=610CFE08 edi=
ebp=0022F7C0 esp=0022F74C program=C:\nonspc\cygwin\bin\wget.exe
cs=001B ds=0023 es=0023 fs=0038 gs= ss=0023
Stack trace:
Frame Function  Args
0022F7C0  77F51BAA  (000CFE08, 6107C8F1, 610CFE08, )
0022FBA8  77F7561D  (1004D9C0, , 0022FC18, 00423EF8)
0022FBB8  00424ED9  (1004D9C0, 0022FBF0, 0001, 0022FBF0)
0022FC18  00423EF8  (1004A340, 002A, 7865646E, 6D74682E)
0022FD38  0041583B  (1004A340, 0022FD7C, 0022FD80, 100662C8)
0022FD98  00420D93  (10066318, 0022FDEC, 0022FDF0, 100662C8)
0022FE18  0041EB7D  (10021A80, 0041E460, 610CFE40, 0041C2F4)
0022FEF0  0041C47B  (0004, 61600B64, 10020330, 0022FF24)
0022FF40  61005018  (610CFEE0, FFFE, 07E4, 610CFE04)
0022FF90  610052ED  (, , 0001, )
0022FFB0  00426D41  (0041B7D0, 037F0009, 0022FFF0, 77E814C7)
0022FFC0  0040103C  (0001, 001D, 7FFDF000, F6213CF0)
0022FFF0  77E814C7  (00401000, , 78746341, 0020)
End of stack trace


wget bug

2005-01-15 Thread Matthew F. Dennis
It seems that wget uses a signed 32 bit value for the content-length in HTTP.  I
haven't looked at the code, but it appears that this is what is happening. 

The problem is that when a file larger than about 2GB is downloaded, wget
reports negative numbers for its size and quits the download right after it
starts.

I would assume that somewhere there is a loop that looks something like:

while( "what I've downloaded" < "what I think the size is" )
{
//do some more downloading.
}

And after the first read from the stream, the loop fails because whatever you
read is indeed bigger than a negative number so it exits.

Of course, this is all speculation on my part about what the code looks like,
but nonetheless the bug does exist on both Linux and Cygwin.
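
To make the arithmetic concrete, here is a minimal C sketch (purely illustrative, not Wget's actual code) of how a Content-Length just over 2 GB wraps negative when stored in a signed 32-bit integer, which would make a loop like the one above stop immediately:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  /* A Content-Length of about 2.5 GB, as a server might send it. */
  const char *content_length = "2500000000";

  long long real_size = atoll(content_length); /* 2500000000 */
  int size32 = (int) real_size;                /* wraps to -1794967296 */
  long long downloaded = 8192;                 /* bytes read so far */

  printf("real: %lld  as signed 32-bit: %d\n", real_size, size32);

  /* The "keep downloading" condition is already false after the first
     read, because any positive byte count exceeds a negative size. */
  if (!(downloaded < size32))
    printf("download loop exits immediately\n");
  return 0;
}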

Thanks,

Matt

---
BTW:
great job, really...  
on wget and all the GNU software in general...
THANKS



Wget bug

2005-02-02 Thread Vitor Almeida



OS = Solaris 8
Platform = Sparc

Test command = /usr/local/bin/wget -r -t0 -m ftp://root:[EMAIL PROTECTED]/usr/openv/var

The directory contains some subdirectories and files to synchronize.

Example:
 
# ls -la /usr/openv/
total 68462
drwxr-xr-x  14 root     bin          512 set  1 17:52 .
drwxr-xr-x  18 root     sys          512 dez 16 17:01 ..
drwxr-xr-x   2 root     bin          512 set  1 17:52 bin
drwxr-xr-x   5 root     bin          512 set  1 17:44 db
drwxr-xr-x   5 root     bin         1024 set  1 17:53 java
drwxr-xr-x   4 root     bin         1536 set  1 17:52 lib
drwxr-xr-x   4 root     bin          512 set  1 17:46 man
drwxr-xr-x   3 root     bin          512 set  1 17:46 msg
drwxr-xr-x  11 root     bin         1024 set  2 12:38 netbackup
drwxr-xr-x   2 root     other        512 set  1 14:23 patch
drwxr-xr-x   2 root     bin          512 set  1 17:47 share
drwxr-xr-x   2 root     bin          512 set  1 17:47 tmp
drwxr-xr-x   5 root     bin          512 set  2 09:48 var
drwxr-xr-x   8 root     bin          512 set  1 19:16 volmgr

# ls -laR /usr/openv/var/
.:
total 18
drwxr-xr-x   5 root     bin          512 set  2 09:48 .
drwxr-xr-x  14 root     bin          512 set  1 17:52 ..
drwxr-xr-x   3 root     bin          512 set  1 17:52 auth
-rw-r--r--   1 root     root           9 set  2 09:48 authorize.txt
-rw-r--r--   1 root     other       2956 dez 18  2002 license.txt
drwx--       2 root     other        512 jan  5 20:56 vnetd
drwxr-xr-x   3 root     bin          512 set  1 17:52 vxss

./auth:
total 42
drwxr-xr-x   3 root     bin          512 set  1 17:52 .
drwxr-xr-x   5 root     bin          512 set  2 09:48 ..
-rw-r--r--   1 root     bin          921 out  3  2002 methods.txt
-rw-r--r--   1 root     bin         1415 set  1 12:11 methods_allow.txt
-rw-r--r--   1 root     bin         1599 out  1  2002 methods_deny.txt
-rw-r--r--   1 root     bin         1459 out  1  2002 names_allow.txt
-rw-r--r--   1 root     bin         1701 out  1  2002 names_deny.txt
-r--r--r--   1 root     bin          965 set  1 17:52 template.methods.txt
-r--r--r--   1 root     bin         1387 set  1 17:52 template.methods_allow.txt
-r--r--r--   1 root     bin         1607 set  1 17:52 template.methods_deny.txt
-r--r--r--   1 root     bin         1467 set  1 17:52 template.names_allow.txt
-r--r--r--   1 root     bin         1709 set  1 17:52 template.names_deny.txt
drwxr-xr-x   4 root     other        512 set  1 12:08 vopie

./auth/vopie:
total 8
drwxr-xr-x   4 root     other        512 set  1 12:08 .
drwxr-xr-x   3 root     bin          512 set  1 17:52 ..
drwx--       3 root     other        512 set  1 12:08 hashed
drwx--       3 root     other        512 set  1 12:08 unhashed

 
Log of the wget command:

Downloaded: 184 bytes in 1 files
--18:02:33--  ftp://root:[EMAIL PROTECTED]/usr/openv/var
   => `10.1.1.10/usr/openv/.listing'
Connecting to 10.1.1.10:21... connected.
Logging in as root ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /usr/openv ... done.
==> PORT ... done.    ==> LIST ... done.

    [ <=>  ] 903          --.--K/s

18:02:34 (192.12 KB/s) - `10.1.1.10/usr/openv/.listing' saved [903]

--18:02:34--  ftp://root:[EMAIL PROTECTED]/usr/openv/var
   => `10.1.1.10/usr/openv/var'
==> CWD not required.
==> PORT ... done.    ==> RETR var ...
No such file `var'.

FINISHED --18:02:34--
Downloaded: 903 bytes in 1 files

NOTE: The ftp command works fine.
 
 


WGET Bug?

2005-04-04 Thread Nijs, J. de

#

C:\Grabtest\wget.exe -r --tries=3 http://www.xs4all.nl/~npo/ -o C:/Grabtest/Results/log

#

--16:23:02--  http://www.xs4all.nl/%7Enpo/

   => `www.xs4all.nl/~npo/index.html'

Resolving www.xs4all.nl... 194.109.6.92

Connecting to www.xs4all.nl[194.109.6.92]:80... failed: No such file or directory.

Retrying.

#


Does WGET always expect an INDEX.HTML as the URL file when grabbing data from the WWW?


Most of the URLs we want to grab are not named index.html but have other names, like:


<http://www.ecb.int/stats/eurofxref/eurofxref-daily.xml>

<http://www.ecb.de/stats/exchange/eurofxref/html/index.en.html>

<http://www.apx.nl/marketresults.html>


Is this a problem for WGET by the way?


Kind regards,

Peter de Nijs
DELTA N.V. afdeling Portfolio Analyse
06-45 57 29 17


===
This e-mail message is intended solely for use by the addressee. It may contain
confidential information and/or information protected by professional privilege
or law. If you have received this message in error, we kindly ask you to notify
us immediately. Thank you!
===





Wget Bug

2005-04-26 Thread Arndt Humpert
Hello,

wget, win32 rel. crashes with huge files.

regards
[EMAIL PROTECTED]




==> Command Line
wget  -m ftp://ftp.freenet.de/pub/filepilot/windows/bildung/wikipedia/
Assert error while mirroring a big file

==> see ftp listing:
P:\temp\wiki\new>ftp ftp.freenet.de
Connected to ftp-0.freenet.de.
220 ftp.freenet.de FTP server ready.
User (ftp-0.freenet.de:(none)): anonymous
331 Password required.
Password:
230 Login completed.
ftp> cd pub
250 Changed working directory to "/pub".
ftp> cd filepilot
250 Changed working directory to "/pub/filepilot".
ftp> cd windows
250 Changed working directory to "/pub/filepilot/windows".
ftp> cd bildung
250 Changed working directory to "/pub/filepilot/windows/bildung".
ftp> cd wikipedia
250 Changed working directory to "/pub/filepilot/windows/bildung/wikipedia".
ftp> dir
200 PORT command ok.
150 Opening data connection.
-rw-r--r--   1 filepilo ftp 61875 Apr 11 13:06 WikiCover.pdf
-rw-r--r--   1 filepilo ftp  344804797 Apr 11 13:20 dbd_76.dbz
-rw-r--r--   1 filepilo ftp425128 Apr 08 13:34 dvdcover_wikipedia.zip
-rw-r--r--   1 filepilo ftp  2752401408 Apr 08 15:30 wp_1_2005.iso
-rw-r--r--   1 filepilo ftp  14407705 Apr 11 13:06 wpcdhtml.zip
-rw-r--r--   1 filepilo ftp  69805003 Apr 11 13:09 wpcdim.zip
-rw-r--r--   1 filepilo ftp  701104128 Apr 11 13:34 wpcdiso.iso
-rw-r--r--   1 filepilo ftp  10758083 Apr 11 13:07 wpcdmath.zip
-rw-r--r--   1 filepilo ftp  121069235 Apr 11 13:12 wpcdxml.zip
226 Transfer complete.
ftp: 632 bytes received in 0,03Seconds 19,75Kbytes/sec.
ftp> bye
221 Goodbye.

P:\temp\wiki\new>

==> Version Info
P:\temp\wiki\new>wget -V
GNU Wget 1.9.1

Copyright (C) 2003 Free Software Foundation, Inc.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

Originally written by Hrvoje Niksic <[EMAIL PROTECTED]>.



==> Screen Output & Error
--11:00:40--  ftp://ftp.freenet.de/pub/filepilot/windows/bildung/wikipedia/wp_1_
2005.iso
   => `ftp.freenet.de/pub/filepilot/windows/bildung/wikipedia/wp_1_2005.
iso'
==> CWD not required.
==> PORT ... done.==> RETR wp_1_2005.iso ... done.
Length: -1,542,565,888

[ <=>  ] -1,542,565,888  122.04K/s

Assertion failed: bytes >= 0, file retr.c, line 292

abnormal program termination


rcv:[EMAIL PROTECTED]

wget bug

2005-10-03 Thread Michael C. Haller



Begin forwarded message:


From: [EMAIL PROTECTED]
Date: October 4, 2005 4:36:09 AM GMT+02:00
To: [EMAIL PROTECTED]
Subject: failure notice


Hi. This is the qmail-send program at sunsite.dk.
I'm afraid I wasn't able to deliver your message to the following  
addresses.

This is a permanent error; I've given up. Sorry it didn't work out.

:
No delivery confirmation received.

--- Below this line is a copy of the message.

From: "Michael C. Haller" <[EMAIL PROTECTED]>
Subject: wget does not encode UTF-8 properly
Date: Tue, 27 Sep 2005 03:35:54 +0200
To: [EMAIL PROTECTED]

wget does not encode UTF-8 properly

wget compiled on Mac OS X Tiger 10.4.2 build 8C46:

wget --version
GNU Wget 1.10.1

Copyright (C) 2005 Free Software Foundation, Inc.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

Originally written by Hrvoje Niksic <[EMAIL PROTECTED]>.

#

--03:24:23--  http://x.dyndns.org/~x/Musik1/Faun/Zauberspru%cc%88che/
           => `x.dyndns.org/~x/Musik1/Faun/ZauberspruÃ%88che/index.html'
Resolving x.dyndns.org... 84.130.231.75
Connecting to x.dyndns.org|84.130.231.75|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
x.dyndns.org/~x/Musik1/Faun/ZauberspruÃ%88che: Invalid argument
x.dyndns.org/~x/Musik1/Faun/ZauberspruÃ%88che/index.html: No such file or directory

Cannot write to `x.dyndns.org/~x/Musik1/Faun/ZauberspruÃ%88che/index.html' (No such file or directory).

FINISHED --03:24:29--
Downloaded: 0 bytes in 0 files
--03:24:29--  http://x.dyndns.org/~x/Musik1/Apocalyptica/Apocalyptica/
           => `x.dyndns.org/~x/Musik1/Apocalyptica/Apocalyptica/index.html'
Resolving x.dyndns.org... 84.130.231.75
Connecting to x.dyndns.org|84.130.231.75|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]

 0K .
--03:26:42--  http://x.dyndns.org/~x/Musik1/Faun/Zauberspru%cc%88che/
           => `x.dyndns.org/~x/Musik1/Faun/ZauberspruÃ%88che/index.html'
Resolving x.dyndns.org... 84.130.231.75
Connecting to x.dyndns.org|84.130.231.75|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
x.dyndns.org/~x/Musik1/Faun/ZauberspruÃ%88che: Invalid argument
x.dyndns.org/~x/Musik1/Faun/ZauberspruÃ%88che/index.html: No such file or directory

Cannot write to `x.dyndns.org/~x/Musik1/Faun/ZauberspruÃ%88che/index.html' (No such file or directory).

FINISHED --03:26:50--
Downloaded: 0 bytes in 0 files
--03:26:50--  http://x.dyndns.org/~x/Musik1/Apocalyptica/Apocalyptica/
           => `x.dyndns.org/~x/Musik1/Apocalyptica/Apocalyptica/index.html'
Resolving x.dyndns.org... 84.130.231.75
Connecting to x.dyndns.org|84.130.231.75|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]






wget bug

2006-11-01 Thread lord maximus
Well, this really isn't a bug per se... but whenever you set -q for no output, it still creates a wget log file on the desktop.


wget bug

2007-05-23 Thread Highlord Ares

When I run wget on certain sites, it tries to download web pages named
similar to http://site.com?variable=yes&mode=awesome.  However, wget isn't
saving any of these files, no doubt because of some file naming issue?  This
problem exists in both the Windows & Unix versions.

hope this helps


wget bug?

2007-07-08 Thread Nikolaus_Hermanspahn
wget under win2000/win XP
I get "No such file or directory" error messages when using the follwing 
command line.

wget -s --save-headers 
"http://www.nndc.bnl.gov/ensdf/browseds.jsp?nuc=%1&class=Arc";

%1 = 212BI
Any ideas?

thank you

Dr Nikolaus Hermanspahn
Advisor (Science)
National Radiation Laboratory
Ministry of Health
DDI: +64 3 366 5059
Fax: +64 3 366 1156

http://www.nrl.moh.govt.nz
mailto:[EMAIL PROTECTED]




Statement of confidentiality: This e-mail message and any accompanying
attachments may contain information that is IN-CONFIDENCE and subject to
legal privilege.
If you are not the intended recipient, do not read, use, disseminate,
distribute or copy this message or attachments.
If you have received this message in error, please notify the sender
immediately and delete this message.


*
This e-mail message has been scanned for Viruses and Content and cleared 
by the Ministry of Health's Content and Virus Filtering Gateway
*


maybe wget bug

2001-04-04 Thread David Christopher Asher

Hello,

I am using wget to invoke a CGI script call, while passing it several
variables.  For example:

wget -O myfile.txt
"http://user:[EMAIL PROTECTED]/myscript.cgi?COLOR=blue&SHAPE=circle"

where myscript.cgi say, makes an image based on the parameters "COLOR" and
"SHAPE".  The problem I am having is when I need to pass a key/value pair
where the value contains the "&" character.  Such as:

wget -O myfile.txt "http://user:[EMAIL PROTECTED]/myscript.cgi?COLOR=blue
& red&SHAPE=circle"

I have tried encoding the "&" as %26, but that does not seem to work (spaces
as %20 works fine).  The error log for the web server shows that the URL
requested does not say %26, but rather "&".  It does not appear to me that
wget is sending the %26 as %26, but perhaps "fixing" it to "&".

I am using GNU wget v1.5.3 with Red Hat 7.0

Thanks!

--
David Christopher Asher







Re: wget bug?

2001-06-14 Thread Jan Prikryl

"Story, Ian" wrote:

> > I have been a very happy user of wget for a long time.  However, today I
> > noticed that some sites, that don't run on port 80, don't work well with
> > wget.  For instance, when I tell wget to go get http://www.yahoo.com, it
> > automatically puts :80 at the end, like this: http://www.yahoo.com:80.
> > That is fine, most of the time, but some sites won't like that, and in
> > fact, will give a 404 error, or other errors.  So, I consulted the
> > documentation, but couldn't find a way around this...is there a
> > fix/workaround/something in the manual that I didn't see or understand to
> > get around this?  I tried a few web searches, and didn't find much
> > information...

Which version of wget do you use? I guess the last version that had this
problem was wget 1.5.3. Try 1.6 or 1.7 (but be warned that compiling with
SSL support in 1.7 often does not work properly yet).

More information about wget may be found at http://sunsite.dk/wget/

-- jan



Re: wget bug (?)

2001-11-15 Thread Ian Abbott

On 14 Nov 2001, at 13:20, Bernard, Shawn wrote:

> I'm not sure if this is a bug or not, but when I ran this line: 
> wget -r -l2 http://www.turnerclassicmovies.com/NowPlaying/Index 
> I get this result: 
(snip)
> `www.turnerclassicmovies.com/Home/Index/0,3436,,00.html' saved [27179] 
> 
> Segmentation Fault(coredump) 

That one is fixed in the CVS repository (not that the CVS 
repository has been maintained for a few months, but that's another 
matter).

As a workaround in wget 1.7, you could try using the option
'-Gmeta', as this bug usually occurs while processing large META 
tags. The '-Gmeta' option causes wget to ignore META tags.




Re: wget bug?!

2002-02-18 Thread Ian Abbott

[The message I'm replying to was sent to <[EMAIL PROTECTED]>. I'm
continuing the thread on <[EMAIL PROTECTED]> as there is no bug and
I'm turning it into a discussion about features.]

On 18 Feb 2002 at 15:14, TD - Sales International Holland B.V. wrote:

> I've tried -w 30
> --waitretry=30
> --wait=30 (I think this one is for multiple files and the time in between 
> those though)
> 
> None of these seem to make wget wanna wait for 30 secs before trying again. 
> Like this I'm hammering the server.

The --waitretry option will wait for 1 second for the first retry,
then 2 seconds, 3 seconds, etc. up to the value specified. So you
may consider the first few retry attempts to be hammering the
server but it will gradually back off.
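
For illustration only (this is a sketch of the behaviour described above, not Wget's source), the back-off amounts to something like:

#include <stdio.h>

/* Linear back-off as described above: wait 1 second before the first
   retry, 2 before the second, and so on, capped at the --waitretry value. */
static int retry_wait(int retry_number, int waitretry_max)
{
  int wait = retry_number;
  return wait > waitretry_max ? waitretry_max : wait;
}

int main(void)
{
  int retry;
  for (retry = 1; retry <= 5; retry++)
    printf("retry %d: wait %d second(s)\n", retry, retry_wait(retry, 30));
  return 0;
}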

It sounds like you want an option to specify the initial retry
interval (currently fixed at 1 second), but Wget currently has no
such option, nor an option to change the amount it increments by
for each retry attempt (also currently fixed at 1 second).

If such features were to be added, perhaps it could work something
like this:

--waitretry=n - same as --waitretry=n,1,1
--waitretry=n,m   - same as --waitretry=n,m,1
--waitretry=n,m,i - wait m seconds for the first retry,
incrementing by i seconds for subsequent
retries up to a maximum of n seconds

The disadvantage of doing it that way is that no-one will remember
which order the numbers should appear, so an alternative is to
leave --waitretry alone and supplement it with --waitretryfirst
and --waitretryincr options.



Re: wget bug?!

2002-02-19 Thread TD - Sales International Holland B.V.

On Monday 18 February 2002 17:52, you wrote:

That would be great. The problem is that I'm using it to retrieve files mostly
on servers that have too many users. No, I don't want to hammer the
server, but I do want to keep on trying at reasonable intervals until I get
the file.

I think the feature would be usable in other scenarios as well. You now have
--waitretry and --wait; in my personal opinion the best would perhaps be to
add --waitint(er)(val) or perhaps just --int(er)(val).

Anyways, thanks for the reply.

Kind regards,

Ferry van Steen

> [The message I'm replying to was sent to <[EMAIL PROTECTED]>. I'm
> continuing the thread on <[EMAIL PROTECTED]> as there is no bug and
> I'm turning it into a discussion about features.]
>
> On 18 Feb 2002 at 15:14, TD - Sales International Holland B.V. wrote:
> > I've tried -w 30
> > --waitretry=30
> > --wait=30 (I think this one is for multiple files and the time in between
> > those though)
> >
> > None of these seem to make wget wanna wait for 30 secs before trying
> > again. Like this I'm hammering the server.
>
> The --waitretry option will wait for 1 second for the first retry,
> then 2 seconds, 3 seconds, etc. up to the value specified. So you
> may consider the first few retry attempts to be hammering the
> server but it will gradually back off.
>
> It sounds like you want an option to specify the initial retry
> interval (currently fixed at 1 second), but Wget currently has no
> such option, nor an option to change the amount it increments by
> for each retry attempt (also currently fixed at 1 second).
>
> If such features were to be added, perhaps it could work something
> like this:
>
> --waitretry=n - same as --waitretry=n,1,1
> --waitretry=n,m   - same as --waitretry=n,m,1
> --waitretry=n,m,i - wait m seconds for the first retry,
> incrementing by i seconds for subsequent
> retries up to a maximum of n seconds
>
> The disadvantage of doing it that way is that no-one will remember
> which order the numbers should appear, so an alternative is to
> leave --waitretry alone and supplement it with --waitretryfirst
> and --waitretryincr options.



wget bug (overflow)

2002-02-26 Thread Vasil Dimov

fbsd1 --- http wget eshop.tar (3.3G) ---> fbsd2

command was:

# wget http://kamenica/eshop.tar

at the second G i got the following:

2097050K .. .. .. .. ..  431.03 KB/s
2097100K .. .. .. .. ..8.14 MB/s
2097150K .. .. .. .. ..3.76 MB/s
-2097104K .. .. .. .. ..   12.21 MB/s
-2097054K .. .. .. .. ..8.14 MB/s
...

So I did nothing, seeing that everything continued "normally".

but at the end i got:

-684104K .. .. .. .. ..1.74 MB/s
-684054K   0.00 B/s
assertion "bytes >= 0" failed: file "retr.c", line 254
Abort trap (core dumped)

# wget -V
GNU Wget 1.8.1

# uname -a
FreeBSD vihren.etrade.xx 4.5-STABLE FreeBSD 4.5-STABLE #0: Sat Feb 23 16:54:34 EET 
2002 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/VIHREN  i386

I'm not sending you the wget.core because the problem is obvious, in my
opinion. I can repeat that run if you want the wget.core or some more
debugging info.
The file is 3594496000 bytes and was copied successfully:

kamenica:~# md5 eshop.tar
MD5 (eshop.tar) = f1709dcad40073b8c8624a8e100d7697

vihren:~# md5 eshop.tar
MD5 (eshop.tar) = f1709dcad40073b8c8624a8e100d7697



Re: wget bug

2002-11-05 Thread Jeremy Hetzler
At 09:20 AM 11/5/2002 -0700, Jing Ping Ye wrote:

> Dear Sir:
> I tried to use "wget" to download data from an ftp site but got the following
> error message:
> > wget  ftp://ftp.ngdc.noaa.gov/pub/incoming/RGON/anc_1m.OCT
> Screen output:
> --
> --09:02:40--  ftp://ftp.ngdc.noaa.gov/pub/incoming/RGON/anc_1m.OCT
>    => `anc_1m.OCT'
> Resolving ftp.ngdc.noaa.gov... done.
> Connecting to ftp.ngdc.noaa.gov[140.172.180.164]:21... connected.
> Logging in as anonymous ... Logged in!
> ==> SYST ... done.    ==> PWD ... done.
> ==> TYPE I ... done.  ==> CWD /pub/incoming/RGON ... done.
> ==> PORT ... done.    ==> RETR anc_1m.OCT ...
> Error in server response, closing control connection.
> Retrying.

Using the -d switch reveals that the server refuses to send the file due to 
insufficient permissions:

200 PORT command successful.
done.==> RETR anc_1m.OCT ...
--> RETR anc_1m.OCT

550 anc_1m.OCT: Permission denied.

No such file `anc_1m.OCT'.



> But when I use ftp (ftp ftp.ngdc.noaa.gov), I can get the data.


False.


$ ftp
ftp> open ftp.ngdc.noaa.gov
Connected to ftp.ngdc.noaa.gov.
220 apex FTP server (Version wu-2.6.1(1) Thu Nov 29 13:24:22 MST 2001) ready.
Name (ftp.ngdc.noaa.gov:**): anonymous
331 Guest login ok, send your complete e-mail address as password.
Password:
230-Please read the file README.txt
230-  it was last modified on Thu Jan  6 07:55:46 2000 - 1033 days ago
230 Guest login ok, access restrictions apply.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd /pub/incoming/RGON
250 CWD command successful.
ftp> pwd
257 "/pub/incoming/RGON" is current directory.
ftp> get anc_1m.OCT
200 PORT command successful.
550 anc_1m.OCT: Permission denied.


This is not a bug in wget.






Re: wget bug

2003-09-26 Thread DervishD
Hi Jack :)

 * Jack Pavlovsky <[EMAIL PROTECTED]> dixit:
> It's probably a bug:
> bug: when downloading 
> wget -mirror ftp://somehost.org/somepath/3acv14~anivcd.mpg, 
>  wget saves it as-is, but when downloading
> wget ftp://somehost.org/somepath/3*, wget saves the files as 
> 3acv14%7Eanivcd.mpg

Yes, it *was* a bug. The latest prerelease has it fixed. I don't
know if the tarball has the latest patches; ask Hrvoje. But if you
are not in a hurry, just wait for 1.9 to be released.

> The human knowledge belongs to the world

True ;))

Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736
http://www.pleyades.net & http://raul.pleyades.net/


Re: wget bug

2003-09-26 Thread Hrvoje Niksic
Jack Pavlovsky <[EMAIL PROTECTED]> writes:

> It's probably a bug: bug: when downloading wget -mirror
> ftp://somehost.org/somepath/3acv14~anivcd.mpg, wget saves it as-is,
> but when downloading wget ftp://somehost.org/somepath/3*, wget saves
> the files as 3acv14%7Eanivcd.mpg

Thanks for the report.  The problem here is that Wget tries to be
"helpful" by encoding unsafe characters in file names to %XX, as is
done in URLs.  Your first example works because of an oversight (!) 
that actually made Wget behave as you expected.

The good news is that the "helpfulness" has been rethought for the
next release and is no longer there, at least not for ordinary
characters like "~" and " ".  Try getting the latest CVS sources, they
should work better in this regard.  (http://wget.sunsite.dk/ explains
how to download the source from CVS.)


Re: Wget Bug

2003-11-10 Thread Hrvoje Niksic
The problem is that the server replies with "login incorrect", which
normally means that authorization has failed and that further retries
would be pointless.  Other than having a natural language parser
built-in, Wget cannot know that the authorization is in fact correct,
but that the server happens to be busy.

Maybe Wget should have an option to retry even in the case of (what
looks like) a login incorrect FTP response.
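
Purely as an illustration of what such an option might key on (this is not Wget code, and the phrases checked are only examples), a heuristic could look at the text of the 530 reply before giving up:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

/* Return nonzero if a 530 reply text hints at a busy server rather than
   a genuinely wrong password, so that a retry makes sense. */
static int ftp_530_looks_retryable(const char *reply)
{
  static const char *hints[] = { "too many", "try later", "busy" };
  char lower[512];
  size_t i;

  for (i = 0; reply[i] != '\0' && i < sizeof lower - 1; i++)
    lower[i] = (char) tolower((unsigned char) reply[i]);
  lower[i] = '\0';

  for (i = 0; i < sizeof hints / sizeof hints[0]; i++)
    if (strstr(lower, hints[i]) != NULL)
      return 1;
  return 0;
}

int main(void)
{
  const char *reply = "530-Sorry! Too many users are logged in.";
  printf("retry? %d\n", ftp_530_looks_retryable(reply));
  return 0;
}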


Re: Wget Bug

2003-11-10 Thread Hrvoje Niksic
"Kempston" <[EMAIL PROTECTED]> writes:

> Yeah, I understand that, but lftp handles it fine even without
> specifying any additional option ;)

But then lftp is hammering servers when real unauthorized entry
occurs, no?

> I`m sure you can work something out

Well, I'm satisfied with what Wget does now.  :-)


Re: wget bug

2004-01-12 Thread Hrvoje Niksic
Kairos <[EMAIL PROTECTED]> writes:

> $ cat wget.exe.stackdump
[...]

What were you doing with Wget when it crashed?  Which version of Wget
are you running?  Was it compiled for Cygwin or natively for Windows?


wget bug report

2004-03-26 Thread Corey Henderson
I sent this message to [EMAIL PROTECTED] as directed in the wget man page, but it 
bounced and said to try this email address.

This bug report is for GNU Wget 1.8.2 tested on both RedHat Linux 7.3 and 9

rpm -q wget
wget-1.8.2-9

When I use wget with -S to show the HTTP headers, and I use the spider switch as
well, it gives me a 501 error on some servers.

The main example I have found was doing it against a server running ntop.

http://www.ntop.org/

You can find an RPM for it at:

http://rpm.pbone.net/index.php3/stat/4/idpl/586625/com/ntop-2.2-0.dag.rh90.i386.rpm.html

You can search with other parameters at rpm.pbone.net to get ntop for other versions
of Linux.

So here is the command and output:

wget -S --spider http://SERVER_WITH_NTOP:3000

HTTP request sent, awaiting response...
 1 HTTP/1.0 501 Not Implemented
 2 Date: Sat, 27 Mar 2004 07:08:24 GMT
 3 Cache-Control: no-cache
 4 Expires: 0
 5 Connection: close
 6 Server: ntop/2.2 (Dag Apt RPM Repository) (i686-pc-linux-gnu)
 7 Content-Type: text/html
21:11:56 ERROR 501: Not Implemented.

I get a 501 error. Echoing $? shows an exit status of 1.

When I don't use the spider, I get the following:

wget -S http://SERVER_WITH_NTOP:3000

HTTP request sent, awaiting response...
 1 HTTP/1.0 200 OK
 2 Date: Sat, 27 Mar 2004 07:09:31 GMT
 3 Cache-Control: max-age=3600, must-revalidate, public
 4 Connection: close
 5 Server: ntop/2.2 (Dag Apt RPM Repository) (i686-pc-linux-gnu)
 6 Content-Type: text/html
 7 Last-Modified: Mon, 17 Mar 2003 20:27:49 GMT
 8 Accept-Ranges: bytes
 9 Content-Length: 1214

100%[==>]  1,214   1.16M/s    ETA 00:00

21:13:04 (1.16 MB/s) - `index.html' saved [1214/1214]



The exit status was 0 and the index.html file was downloaded.

If this is a bug, please fix it in your next release of wget. If it is not a bug, I
would appreciate a brief explanation as to why.

Thank You

Corey Henderson
Chief Programmer
GlobalHost.com

Re: WGET Bug?

2005-04-04 Thread Hrvoje Niksic
"Nijs, J. de" <[EMAIL PROTECTED]> writes:

> #
> C:\Grabtest\wget.exe -r --tries=3 http://www.xs4all.nl/~npo/ -o
> C:/Grabtest/Results/log
> #
> --16:23:02--  http://www.xs4all.nl/%7Enpo/
>   => `www.xs4all.nl/~npo/index.html'
> Resolving www.xs4all.nl... 194.109.6.92
> Connecting to www.xs4all.nl[194.109.6.92]:80... failed: No such file or
> directory.
> Retrying.
> #
>
> Does WGET always expect an INDEX.HTML as the URL file when grabbing
> data from the WWW?

No, what you see is the result of two different things:

1. Wget uses "index.html" as the file name when one is missing from
   the URL because it ends with an empty path component.

2. Wget 1.9.1 (and previous versions) doesn't correctly display
   Winsock error messages, such as "connection refused".  The error
   message you're seeing doesn't reflect what really happened.

In this case, only issue #2 is a real bug.  It has been fixed in the
CVS version, which is unfortunately not yet available as a Windows
binary.


Re: Wget Bug

2005-04-26 Thread Hrvoje Niksic
Arndt Humpert <[EMAIL PROTECTED]> writes:

> wget, win32 rel. crashes with huge files.

Thanks for the report.  This problem has been fixed in the latest
version, available at http://xoomer.virgilio.it/hherold/ .


wget bug report

2005-06-12 Thread A.Jones
Sorry for the crosspost, but the wget Web site is a little confusing on the 
point of where to send bug reports/patches.

Just installed wget 1.10 on Friday. Over the weekend, my scripts failed with 
the 
following error (once for each wget run):
Assertion failed: wget_cookie_jar != NULL, file http.c, line 1723
Abort - core dumped

All of my command lines are similar to this:
/home/programs/bin/wget -q --no-cache --no-cookies -O /home/programs/etc/alte_seiten/xsr.html 'http://www.enterasys.com/download/download.cgi?lib=XSR'

After taking a look at it, I implemented the following change to http.c and
tried again. It works for me, but I don't know what other implications my
change might have.

--- http.c.orig Mon Jun 13 08:04:23 2005
+++ http.c  Mon Jun 13 08:06:59 2005
@@ -1715,6 +1715,7 @@
   hs->remote_time = resp_header_strdup (resp, "Last-Modified");
 
   /* Handle (possibly multiple instances of) the Set-Cookie header. */
+  if (opt.cookies)
   {
 char *pth = NULL;
 int scpos;


Kind regards

MVV Energie AG
Abteilung AI.C

Andrew Jones

Telefon: +49 621 290-3645
Fax: +49 621 290-2677
E-Mail: [EMAIL PROTECTED] Internet: www.mvv.de
MVV Energie · Luisenring 49 · 68159 Mannheim
Handelsregister-Nr. HRB 1780
Vorsitzender des Aufsichtsrates: Oberbürgermeister Gerhard Widder
Vorstand: Dr. Rudolf Schulten (Vorsitzender) · Dr. Werner Dub · Hans-Jürgen 
Farrenkopf · Karl-Heinz Trautmann


RE: wget bug

2007-05-23 Thread Willener, Pat
This does not look like a valid URL to me - shouldn't there be a slash at the 
end of the domain name?
 
Also, when talking about a bug (or anything else), it is always helpful if you 
specify the wget version (number).



From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Highlord Ares
Sent: Thursday, May 24, 2007 11:41
To: [EMAIL PROTECTED]
Subject: wget bug


When I run wget on certain sites, it tries to download web pages named
similar to http://site.com?variable=yes&mode=awesome.  However, wget isn't
saving any of these files, no doubt because of some file naming issue?  This
problem exists in both the Windows & Unix versions.

hope this helps



RE: wget bug

2007-05-24 Thread Tony Lewis
Highlord Ares wrote:

> it tries to download web pages named similar to
> http://site.com?variable=yes&mode=awesome

Since "&" is a reserved character in many command shells, you need to quote
the URL on the command line:

wget "http://site.com?variable=yes&mode=awesome"

Tony

 



Re: wget bug?

2007-07-09 Thread Mauro Tortonesi
On Mon, 9 Jul 2007 15:06:52 +1200
[EMAIL PROTECTED] wrote:

> wget under win2000/win XP
> I get "No such file or directory" error messages when using the follwing 
> command line.
> 
> wget -s --save-headers 
> "http://www.nndc.bnl.gov/ensdf/browseds.jsp?nuc=%1&class=Arc";
> 
> %1 = 212BI
> Any ideas?

hi nikolaus,

in windows, you're supposed to use %VARIABLE_NAME% for variable substitution. 
try using %1% instead of %1.

-- 
Mauro Tortonesi <[EMAIL PROTECTED]>


Re: wget bug?

2007-07-09 Thread Matthias Vill

Mauro Tortonesi wrote:

> On Mon, 9 Jul 2007 15:06:52 +1200
> [EMAIL PROTECTED] wrote:
>
> > wget under win2000/win XP
> > I get "No such file or directory" error messages when using the following
> > command line.
> >
> > wget -s --save-headers
> > "http://www.nndc.bnl.gov/ensdf/browseds.jsp?nuc=%1&class=Arc";
> >
> > %1 = 212BI
> > Any ideas?
>
> hi nikolaus,
>
> in windows, you're supposed to use %VARIABLE_NAME% for variable substitution.
> try using %1% instead of %1.



AFAIK it is OK to use %1, because it is a special case. Also, wouldn't the error
be a 404 or some other wget error if the variable were substituted in a wrong
way? (Actually, even then you get a 200 response with that URL.)

I just tried using the command inside a batch file and came across
another problem: you used a lowercase -s, which is not recognized by my
wget version, but an uppercase -S is. I guess you should change that.

I would guess wget is not in your PATH.
Try using "c:\path\to\the directory\wget.exe" instead of just wget.

If this still does not help, add an explicit "--restrict-file-names=windows" to
your options, so wget does not try to use the ? inside a filename
(normally not needed).

So a should-work-in-all-cases version is:

"c:\path\wget.exe" -S --save-headers --restrict-file-names=windows
"http://www.nndc.bnl.gov/ensdf/browseds.jsp?nuc=%1&class=Arc";

Of course that is just one line, but my dumb mail editor wrapped it.

Greetings
Matthias


Re: WGET bug...

2008-07-11 Thread Micah Cowan

HARPREET SAWHNEY wrote:
> Hi,
> 
> I am getting a strange bug when I use wget to download a binary file
> from a URL versus when I manually download.
> 
> The attached ZIP file contains two files:
> 
> 05.upc --- manually downloaded
> dum.upc--- downloaded through wget
> 
> wget adds a number of ascii characters to the head of the file and seems
> to delete a similar number from the tail.
> 
> So the file sizes are the same but the addition and deletion renders
> the file useless.
> 
> Could you please direct me on if I should be using some specific
> option to avoind this problem?

In the future, it's useful to mention which version of Wget you're using.

The problem you're having is that the server is adding the extra HTML at
the front of your session, and then giving you the file contents anyway.
It's a bug in the PHP code that serves the file.

You're getting this extra content because you are not logged in when
you're fetching it. You need to have Wget send a cookie with an
login-session information, and then the server will probably stop
sending the corrupting information at the head of the file. The site
does not appear to use HTTP's authentication mechanisms, so the
<[EMAIL PROTECTED]> bit in the URL doesn't do you any good. It uses
Forms-and-cookies authentication.

Hopefully, you're using a browser that stores its cookies in a text
format, or that is capable of exporting to a text format. In that case,
you can just ensure that you're logged in in your browser, and use the
--load-cookies option with Wget to use the same session
information.
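
(For reference, the text format --load-cookies expects is the Netscape cookies.txt format: one tab-separated line per cookie giving the domain, a subdomain flag, the path, a secure flag, the expiry time as a Unix timestamp, the name and the value. The entry below is made up purely as an illustration.)

# Netscape HTTP Cookie File
.example.com	TRUE	/	FALSE	1893456000	sessionid	abc123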

Otherwise, you'll need to use --save-cookies with Wget to simulate the
login form post, which is tricky and requires some understanding of HTML
Forms.

--
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer,
and GNU Wget Project Maintainer.
http://micah.cowan.name/


Re: WGET bug...

2008-07-11 Thread Micah Cowan

HARPREET SAWHNEY wrote:
> Hi,
> 
> Thanks for the prompt response.
> 
> I am using
> 
> GNU Wget 1.10.2
> 
> I tried a few things on your suggestion but the problem remains.
> 
> 1. I exported the cookies file in Internet Explorer and specified
> that in the Wget command line. But same error occurs.
> 
> 2. I have an open session on the site with my username and password.
> 
> 3. I also tried running wget while I am downloading a file from the
> IE session on the site, but the same error.

Sounds like you'll need to get the appropriate cookie by using Wget to
login to the website. This requires site-specific information from the
user-login form page, though, so I can't help you without that.

If you know how to read some HTML, then you can find the HTML form used
for posting username/password stuff, and use

wget --keep-session-cookies --save-cookies=cookies.txt \
  --post-data='username=foo&password=bar' ACTION

Where ACTION is the value of the form's action field, USERNAME and
PASSWORD (and possibly further required values) are field names from the
HTML form, and FOO and BAR are the username/password.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer,
and GNU Wget Project Maintainer.
http://micah.cowan.name/


Re: maybe wget bug

2001-04-18 Thread Hack Kampbjørn

David Christopher Asher wrote:
> 
> Hello,
> 
> I am using wget to invoke a CGI script call, while passing it several
> variables.  For example:
> 
> wget -O myfile.txt
> "http://user:[EMAIL PROTECTED]/myscript.cgi?COLOR=blue&SHAPE=circle"
> 
> where myscript.cgi say, makes an image based on the parameters "COLOR" and
> "SHAPE".  The problem I am having is when I need to pass a key/value pair
> where the value contains the "&" character.  Such as:
> 
> wget -O myfile.txt "http://user:[EMAIL PROTECTED]/myscript.cgi?COLOR=blue
> & red&SHAPE=circle"
> 
> I have tried encoding the "&" as %26, but that does not seem to work (spaces
> as %20 works fine).  The error log for the web server shows that the URL
> requested does not say %26, but rather "&".  It does not appear to me that
> wget is sending the %26 as %26, but perhaps "fixing" it to "&".

You have hit one of Wget's "features": it is overzealous in converting
URLs into canonical form. As you have discovered, Wget first converts all
encoded characters back to their real value and then encodes all those
that are unsafe for sending in URLs.

I know of no workaround for this "feature". I tried encoding the '%'
yet another time, like '%2526', but Wget then sends the whole thing '%2526'
8-(
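
To illustrate what the server should ultimately see, here is a small, stand-alone C sketch (not Wget code) that percent-encodes a single query value so a literal '&' inside it travels as %26 instead of acting as a parameter separator:

#include <stdio.h>
#include <string.h>

/* Percent-encode one query value so that characters such as '&', '=',
   '%' and space arrive as data rather than as URL syntax. */
static void encode_value(const char *in, char *out, size_t outsz)
{
  static const char keep[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~";
  size_t o = 0;

  for (; *in != '\0' && o + 4 < outsz; in++) {
    if (strchr(keep, *in) != NULL)
      out[o++] = *in;
    else
      o += (size_t) sprintf(out + o, "%%%02X", (unsigned char) *in);
  }
  out[o] = '\0';
}

int main(void)
{
  char value[128];
  encode_value("blue & red", value, sizeof value);
  /* Prints: COLOR=blue%20%26%20red&SHAPE=circle */
  printf("COLOR=%s&SHAPE=circle\n", value);
  return 0;
}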

> 
> I am using GNU wget v1.5.3 with Red Hat 7.0

Note that the latest version is 1.6 available from a GNU mirror near
you. See the web-site (http://sunsite.dk/wget/) for directions.

> 
> Thanks!
> 
> --
> David Christopher Asher

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: maybe wget bug

2001-04-23 Thread Hrvoje Niksic

Hack Kampbjørn <[EMAIL PROTECTED]> writes:

> You have hit one of Wget "features", it is overzealous in converting
> URLs into canonical form. As you have discovered Wget first converts
> all encoded characters back to their real value and then encodes all
> those that are unsafe sending in URLs.

It's a bug.  The correct solution has been proposed by Anon
Sricharoenchai and I've implemented the function, but it will take
some time to integrate it into Wget.



Re: [Wget]: Bug submission

2001-12-29 Thread Hrvoje Niksic

[ Please mail bug reports to <[EMAIL PROTECTED]>, not to me directly. ]

Nuno Ponte <[EMAIL PROTECTED]> writes:

> I get a segmentation fault when invoking:
> 
> wget -r
> http://java.sun.com/docs/books/performance/1st_edition/html/JPTOC.fm.html
> 
> My Wget version is 1.7-3, the one which is bundled with RedHat
> 7.2. I attached my .wgetrc.

Wget 1.7 is fairly old -- it was followed by a bugfix 1.7.1 release,
and then 1.8 and 1.8.1.  Please try upgrading to the latest version,
1.8.1, and see if the bug repeats.  I couldn't repeat it with 1.8.1.



Re: wget bug (overflow)

2002-04-15 Thread Hrvoje Niksic

I'm afraid that downloading files larger than 2G is not supported by
Wget at the moment.



wget bug: directory overwrite

2004-04-05 Thread Juhana Sadeharju
Hello.

Problem: When downloading all in
   http://udn.epicgames.com/Technical/MyFirstHUD
wget overwrites the downloaded MyFirstHUD file with
MyFirstHUD directory (which comes later).

GNU Wget 1.9.1
wget -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np -U Mozilla $@

Solution: Use of -E option.

Regards,
Juhana


Re: wget bug report

2005-06-24 Thread Hrvoje Niksic
<[EMAIL PROTECTED]> writes:

> Sorry for the crosspost, but the wget Web site is a little confusing
> on the point of where to send bug reports/patches.

Sorry about that.  In this case, either address is fine, and we don't
mind the crosspost.

> After taking a look at it, i implemented the following change to
> http.c and tried again. It works for me, but i don't know what other
> implications my change might have.

It's exactly the correct change.  A similar fix has already been
integrated in the CVS (in fact subversion) code base.

Thanks for the report and the patch.


wget bug with ftp/passive

2004-01-21 Thread don
Hello,
I think I've come across a little bug in wget when using it to get a file
via ftp.

I did not specify the "passive" option, yet it appears to have been used
anyway. Here's a short transcript:

[EMAIL PROTECTED] sim390]$ wget ftp://musicm.mcgill.ca/sim390/sim390dm.zip
--21:05:21--  ftp://musicm.mcgill.ca/sim390/sim390dm.zip
   => `sim390dm.zip'
Resolving musicm.mcgill.ca... done.
Connecting to musicm.mcgill.ca[132.206.120.4]:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.==> PWD ... done.
==> TYPE I ... done.  ==> CWD /sim390 ... done.
==> PASV ...
Cannot initiate PASV transfer.
==> PORT ... done.==> RETR sim390dm.zip ... done.

[EMAIL PROTECTED] sim390]$ man wget


As you can see, PASV was attempted (and failed)

I was looking for an option to prevent passive mode.

[EMAIL PROTECTED] sim390]$ wget --version
GNU Wget 1.8.2

Running on Fedora Core 1.

Regards,
Don Russell


wget bug with large files

2004-12-10 Thread Roberto Sebastiano
I got a crash in wget downloading a large iso file (2,4 GB)


newdeal:/pub/isos# wget -c
ftp://ftp.belnet.be/linux/fedora/linux/core/3/i386/iso/FC3-i386-DVD.iso
--09:22:17--
ftp://ftp.belnet.be/linux/fedora/linux/core/3/i386/iso/FC3-i386-DVD.iso
   => `FC3-i386-DVD.iso'
Resolving ftp.belnet.be... 193.190.198.20
Connecting to ftp.belnet.be[193.190.198.20]:21... connected.
Accesso come utente anonymous ... Login eseguito!
==> SYST ... fatto.   ==> PWD ... fatto.
==> TYPE I ... fatto.  ==> CWD /linux/fedora/linux/core/3/i386/iso ...
fatto.
==> SIZE FC3-i386-DVD.iso ... fatto.
==> PASV ... fatto.   ==> REST 2079173504 ... fatto.
==> RETR FC3-i386-DVD.iso ... fatto.

100%[+=>] 2,147,470,560   60.39K/s    ETA 00:00
wget: progress.c:704: create_image: Assertion `insz <= dlsz' failed.
Aborted


then I tried to resume the download ..

newdeal:/pub/isos# wget -c
ftp://ftp.belnet.be/linux/fedora/linux/core/3/i386/iso/FC3-i386-DVD.iso
--09:41:40--
ftp://ftp.belnet.be/linux/fedora/linux/core/3/i386/iso/FC3-i386-DVD.iso
   => `FC3-i386-DVD.iso'
Resolving ftp.belnet.be... 193.190.198.20
Connecting to ftp.belnet.be[193.190.198.20]:21... connected.
Accesso come utente anonymous ... Login eseguito!
==> SYST ... fatto.   ==> PWD ... fatto.
==> TYPE I ... fatto.  ==> CWD /linux/fedora/linux/core/3/i386/iso ...
fatto.
==> SIZE FC3-i386-DVD.iso ... fatto.
==> PASV ... fatto.   ==> REST -2147476576 ... 
REST fallito, ricomincio dall'inizio. (restarting from beginning)
==> RETR FC3-i386-DVD.iso ... fatto.

[ <=> ] 551,648        63.87K/s


Here it deleted the old iso image (2,1GB downloaded) and started from
the beginning... shouldn't it save the new file with a .1 suffix?



Let me know if I can help you tracking this bug


Thanks,
-- 
Roberto Sebastiano <[EMAIL PROTECTED]>



wget BUG: ftp file retrieval

2005-11-25 Thread Arne Caspari

Hello,

current wget seems to have the following bug in the ftp retrieval code:

When called like:
wget user:[EMAIL PROTECTED]/foo/bar/file.tgz

and foo or bar is a read/execute protected directory while file.tgz is 
user-readable, wget fails to retrieve the file because it tries to CWD 
into the directory first.


I think the correct behaviour should be not to CWD into the directory
but to issue a GET request with the full path instead (which will
succeed).
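
A hypothetical FTP exchange illustrating the difference (the server replies below are invented for illustration, not taken from a real session):

--> CWD /foo/bar
<--- 550 /foo/bar: Permission denied.

--> RETR /foo/bar/file.tgz
<--- 150 Opening BINARY mode data connection for /foo/bar/file.tgz.
<--- 226 Transfer complete.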


Best regards,

Arne Caspari



wget bug - after closing control connection

2001-03-08 Thread Cezary Sobaniec

Hello,

I've found a (less important) bug in wget.  I've been downloading
a file from an FTP server and the control connection of the FTP service
was closed by the server.  After that wget started to print progress
information incorrectly (beyond 100%).

The log follows:
_

# wget -nd ftp://ftp.suse.com/pub/suse/i386/update/7.0/n1/mod_php.rpm
--12:30:48--  ftp://ftp.suse.com:21/pub/suse/i386/update/7.0/n1/mod_php.rpm
   => `mod_php.rpm'
Connecting to ftp.suse.com:21... connected!
Logging in as anonymous ... Logged in!
==> TYPE I ... done.  ==> CWD pub/suse/i386/update/7.0/n1 ... done.
==> PORT ... done.==> RETR mod_php.rpm ... done.
Length: 1,599,213 (unauthoritative)

0K -> .. .. .. .. .. [  3%]
   50K -> .. .. .. .. .. [  6%]
  100K -> .. .. .. .. .. [  9%]
  150K -> .. .. .. .. .. [ 12%]
  200K -> .. .. .. .. .. [ 16%]
  250K -> .. .. .. .. .. [ 19%]
  300K -> .. .. .. .. .. [ 22%]
  350K -> .. .. .. .. .. [ 25%]
  400K -> .. .. .. .. .. [ 28%]
  450K -> .. .. .. .. .. [ 32%]
  500K -> .. .. .. .. .. [ 35%]
  550K -> .. .. .[ 36%]

12:41:36 (916.90 B/s) - Control connection closed.
Retrying.

--12:50:38--  ftp://ftp.suse.com:21/pub/suse/i386/update/7.0/n1/mod_php.rpm
  (try: 3) => `mod_php.rpm'
Connecting to ftp.suse.com:21... connected!
Logging in as anonymous ... Logged in!
==> TYPE I ... done.  ==> CWD pub/suse/i386/update/7.0/n1 ... done.
==> PORT ... done.==> REST 626688 ... done.
==> RETR mod_php.rpm ... done.
Length: 972,525 [345,837 to go] (unauthoritative)

  [ skipping 600K ]
  600K -> ,, ,, .. .. .. [ 68%]
  650K -> .. .. .. .. .. [ 72%]

12:57:59 (187.36 B/s) - Control connection closed.
Retrying.

--12:57:59--  ftp://ftp.suse.com:21/pub/suse/i386/update/7.0/n1/mod_php.rpm
  (try: 4) => `mod_php.rpm'
Connecting to ftp.suse.com:21... connected!
Logging in as anonymous ... Logged in!
==> TYPE I ... done.  ==> CWD pub/suse/i386/update/7.0/n1 ... done.
==> PORT ... done.==> REST 708608 ... done.
==> RETR mod_php.rpm ... done.
Length: 890,605 [181,997 to go] (unauthoritative)

  [ skipping 650K ]
  650K -> ,, ,, ,, ,, ,, [ 80%]
  700K -> .. .. .. .. .. [ 86%]
  750K -> .. .. .. .. .. [ 91%]
  800K -> .. .. .. .. .. [ 97%]
  850K -> .. .. .. .. .. [103%]
  900K -> .. .. .. .. .. [109%]
  950K -> .. .. .. .. .. [114%]
 1000K -> .. .. .. .. .. [120%]
 1050K -> .. .. .. .. .. [126%]
 1100K -> .. .. .. .. .. [132%]
 1150K -> .. .. .. .. .. [137%]

-- 
("`-''-/").___..--''"`-._  Cezary Sobaniec
 `6_ 6  )   `-.  ( ).`-.__.')  Institute of Computing Science
 (_Y_.)'  ._   )  `._ `. ``-..-'   Poznan University of Technology
   _..`--'_..-_/  /--'_.' ,'   [EMAIL PROTECTED]
  (il).-''  (li).'  ((!.-' tel. (+48 61) 665-28-09



Re: wget bug with ftp/passive

2004-01-22 Thread Hrvoje Niksic
don <[EMAIL PROTECTED]> writes:

> I did not specify the "passive" option, yet it appears to have been used
> anyway Here's a short transcript:
>
> [EMAIL PROTECTED] sim390]$ wget ftp://musicm.mcgill.ca/sim390/sim390dm.zip
> --21:05:21--  ftp://musicm.mcgill.ca/sim390/sim390dm.zip
>=> `sim390dm.zip'
> Resolving musicm.mcgill.ca... done.
> Connecting to musicm.mcgill.ca[132.206.120.4]:21... connected.
> Logging in as anonymous ... Logged in!
> ==> SYST ... done.==> PWD ... done.
> ==> TYPE I ... done.  ==> CWD /sim390 ... done.
> ==> PASV ...
> Cannot initiate PASV transfer.

Are you sure that something else hasn't done it for you?  For example,
a system-wide initialization file `/usr/local/etc/wgetrc' or
`/etc/wgetrc'.
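
(For reference, the wgetrc line that would turn this on is "passive_ftp = on";
and if your wget supports the -e/--execute option, running
"wget -e passive_ftp=off ..." should override it for a single invocation.)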


Re: wget bug with ftp/passive

2004-08-12 Thread Jeff Connelly
On Wed, 21 Jan 2004 23:07:30 -0800, you wrote:
>Hello,
>I think I've come across a little bug in wget when using it to get a file
>via ftp.
>
>I did not specify the "passive" option, yet it appears to have been used
>anyway Here's a short transcript:
Passive FTP can be specified in /etc/wgetrc or /usr/local/etc/wgetrc, and then
it's impossible to turn it off.  There is no --active-mode flag as far
as I can tell.

I submitted a patch to wget-patches under the title of 
"Patch to add --active-ftp and make --passive-ftp default", which does
what it says.
Your configuration is setting passive mode as the default, but the stock
wget defaults to active (active mode doesn't work too well behind some
firewalls).  --active-ftp is a very useful option in these cases.

Last I checked, the patch hasn't been committed. I can't find the wget-patches
mail archives anywhere, either. So I'll paste it here, in hopes that it helps.

-Jeff Connelly

=cut here=
Common subdirectories: doc.orig/ChangeLog-branches and doc/ChangeLog-branches
diff -u doc.orig/wget.pod doc/wget.pod
--- doc.orig/wget.pod   Wed Jul 21 20:17:29 2004
+++ doc/wget.podWed Jul 21 20:18:56 2004
@@ -888,12 +888,17 @@
 system-specific.  This is why it currently works only with Unix FTP
 servers (and the ones emulating Unix C output).

+=item B<--active-ftp>
+
+Use the I<active> FTP retrieval scheme, in which the server
+initiates the data connection. This is sometimes required to connect
+to FTP servers that are behind firewalls.

 =item B<--passive-ftp>

 Use the I<passive> FTP retrieval scheme, in which the client
 initiates the data connection.  This is sometimes required for FTP
-to work behind firewalls.
+to work behind firewalls, and as such is enabled by default.


 =item B<--retr-symlinks>
Common subdirectories: src.orig/.libs and src/.libs
Common subdirectories: src.orig/ChangeLog-branches and src/ChangeLog-branches
diff -u src.orig/init.c src/init.c
--- src.orig/init.c Wed Jul 21 20:17:33 2004
+++ src/init.c  Wed Jul 21 20:17:59 2004
@@ -255,6 +255,7 @@
   opt.ftp_glob = 1;
   opt.htmlify = 1;
   opt.http_keep_alive = 1;
+  opt.ftp_pasv = 1;
   opt.use_proxy = 1;
   tmp = getenv ("no_proxy");
   if (tmp)
diff -u src.orig/main.c src/main.c
--- src.orig/main.c Wed Jul 21 20:17:33 2004
+++ src/main.c  Wed Jul 21 20:17:59 2004
@@ -217,7 +217,8 @@
 FTP options:\n\
   -nr, --dont-remove-listing   don\'t remove `.listing\' files.\n\
   -g,  --glob=on/off   turn file name globbing on or off.\n\
-   --passive-ftp   use the \"passive\" transfer mode.\n\
+   --passive-ftp   use the \"passive\" transfer mode (default).\n\
+   --active-ftpuse the \"active\" transfer mode.\n\
   --retr-symlinks when recursing, get linked-to files (not dirs).\n\
 \n"), stdout);
   fputs (_("\
@@ -285,6 +286,7 @@
 { "no-parent", no_argument, NULL, 133 },
 { "non-verbose", no_argument, NULL, 146 },
 { "passive-ftp", no_argument, NULL, 139 },
+{ "active-ftp", no_argument, NULL, 167 },
 { "page-requisites", no_argument, NULL, 'p' },
 { "quiet", no_argument, NULL, 'q' },
 { "random-wait", no_argument, NULL, 165 },
@@ -397,6 +399,9 @@
case 139:
  setval ("passiveftp", "on");
  break;
+case 167:
+  setval ("passiveftp", "off");
+  break;
case 141:
  setval ("noclobber", "on");
  break;


wget -- bug / feature request (not sure)

2004-09-04 Thread Vlad Kudelin
Hello,

Probably I am just too lazy, haven't spent enough time to read the man, and
wget  can actually do exactly what I want.
If so -- I do apologize for taking your time.
Otherwise: THANKS for your time!..:-).

My problem is:
redirects.

I am trying to catch them by using, say, netcat ... or writing some simple
pieces of software -- sending HTTP GET and catching the "Location:" in
response. What I've found out is that (obviously) wget is wa-ay more
sophisticated and can do a much better job, especially in certain cases.

I started using it by basically catching stderr from wget [params my_urls]
and then parsing it -- looking for the "^Location: " pattern.
Works great.
The downside is: performance.
You see, I don't need the actual content -- only the canonical URL. But
wget just wgets it - no matter what.

Since (from my perspective) this is a case of "If Wget does not behave
as documented, it's a bug" -- according to the manual -- I am taking the
liberty of filing a bug.

(The "expected" behavior I'm talking about is this:  if I use
"--spider", I expect wget do nothing after finding the server -- like
sending GET to the server and getting HTML back).
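
For what it's worth, something close to this already works if your wget sends a
HEAD request for --spider and prints the server response with -S (true for
recent releases); the URL below is just a placeholder:

wget -S --spider 'http://example.com/some/page' 2>&1 | grep -i 'Location:'

That shows the redirect target(s) without fetching the page body.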

That's my bug -- and/or a feature I'd really like to have.  An alternative
would be adding --some_flag=n, meaning "receive no more than n lines of
HTML".

Do you think that this could be a useful feature that other people would
probably love too?...

Thanks for your time and for a great tool,

Vlad.




Re: wget BUG: ftp file retrieval

2005-11-25 Thread Hrvoje Niksic
Arne Caspari <[EMAIL PROTECTED]> writes:

> When called like:
> wget user:[EMAIL PROTECTED]/foo/bar/file.tgz
>
> and foo or bar is a read/execute protected directory while file.tgz is
> user-readable, wget fails to retrieve the file because it tries to CWD
> into the directory first.
>
> I think the correct behaviour should be not to CWD into the
> directory but to issue a GET request with the full path instead (
> which will succeed ).

I believe that CWD is mandated by the FTP specification, but you're
also right that Wget should try both variants.  You can force Wget
into getting the file without CWD using this kludge:

wget ftp://user:[EMAIL PROTECTED]/%2Ffoo%2Fbar%2Ffile.tgz -O file.tgz


Re: wget BUG: ftp file retrieval

2005-11-25 Thread Mauro Tortonesi

Hrvoje Niksic wrote:

Arne Caspari <[EMAIL PROTECTED]> writes:

I believe that CWD is mandated by the FTP specification, but you're
also right that Wget should try both variants.


i agree. perhaps when retrieving file A/B/F.X we should try to use:

GET A/B/F.X

first, then:

CWD A/B
GET F.X

if the previous attempt failed, and:

CWD A
CWD B
GET F.X

as a last resort. what do you think?
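
A rough sketch of that fallback ladder (not wget source; ftp_cwd()/ftp_retr()
are hypothetical stand-ins for the real FTP layer, and to make the whole ladder
visible they simulate a picky server: RETR refuses paths containing '/', and
CWD refuses multi-component paths):

#include <stdio.h>
#include <string.h>

static int
ftp_retr (const char *file)
{
  printf ("RETR %s -> %s\n", file, strchr (file, '/') ? "550" : "150");
  return strchr (file, '/') ? -1 : 0;
}

static int
ftp_cwd (const char *dir)
{
  printf ("CWD %s -> %s\n", dir, strchr (dir, '/') ? "550" : "250");
  return strchr (dir, '/') ? -1 : 0;
}

/* Fetch A/B/F.X as proposed above: full-path RETR first, then a single CWD,
   then component-by-component CWDs as a last resort. */
int
fetch_with_fallback (void)
{
  if (ftp_retr ("A/B/F.X") == 0)
    return 0;
  if (ftp_cwd ("A/B") == 0 && ftp_retr ("F.X") == 0)
    return 0;
  if (ftp_cwd ("A") == 0 && ftp_cwd ("B") == 0 && ftp_retr ("F.X") == 0)
    return 0;
  return -1;
}

int
main (void)
{
  return fetch_with_fallback () == 0 ? 0 : 1;
}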

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linuxhttp://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Re: wget BUG: ftp file retrieval

2005-11-25 Thread Arne Caspari
Thank you all for your very fast response. As a further note: When this 
error occurs, wget bails out with the following error message:

"No such directory foo/bar".

I think it should instead be "Could not access foo/bar: Permission 
denied" or similar in such a situation.


/Arne


Mauro Tortonesi wrote:


Hrvoje Niksic wrote:


Arne Caspari <[EMAIL PROTECTED]> writes:

I believe that CWD is mandated by the FTP specification, but you're
also right that Wget should try both variants.



i agree. perhaps when retrieving file A/B/F.X we should try to use:

GET A/B/F.X

first, then:

CWD A/B
GET F.X

if the previous attempt failed, and:

CWD A
CWD B
GET F.X

as a last resort. what do you think?





Re: wget BUG: ftp file retrieval

2005-11-25 Thread Hrvoje Niksic
Mauro Tortonesi <[EMAIL PROTECTED]> writes:

> Hrvoje Niksic wrote:
>> Arne Caspari <[EMAIL PROTECTED]> writes:
>>
>> I believe that CWD is mandated by the FTP specification, but you're
>> also right that Wget should try both variants.
>
> i agree. perhaps when retrieving file A/B/F.X we should try to use:
>
> GET A/B/F.X
>
> first, then:
>
> CWD A/B
> GET F.X
>
> if the previous attempt failed, and:
>
> CWD A
> CWD B
> GET F.X
>
> as a last resort. what do you think?

That might work.  Also don't prepend the necessary prepending of $CWD
to those paths.


Re: wget BUG: ftp file retrieval

2005-11-25 Thread Hrvoje Niksic
Hrvoje Niksic <[EMAIL PROTECTED]> writes:

> That might work.  Also don't prepend the necessary prepending of $CWD
> to those paths.

Oops, I meant "don't forget to prepend ...".


Re: wget BUG: ftp file retrieval

2005-11-25 Thread Steven M. Schweda
From: Hrvoje Niksic

> Also don't [forget to] prepend the necessary [...] $CWD
> to those paths.

   Or, better yet, _DO_ forget to prepend the trouble-causing $CWD to
those paths.

   As you might recall from my changes for VMS FTP servers (if you had
ever looked at them), this scheme causes no end of trouble.  A typical
VMS FTP server reports the CWD in VMS form (for example,
"SYS$SYSDEVICE:[ANONYMOUS]").  It may be willing to use a UNIX-like path
in a CWD command (for example, "CWD A/B"), but it's _not_ willing to use
a mix of them (for example, "SYS$SYSDEVICE:[ANONYMOUS]/A/B").

   At a minimum, a separate CWD should be used to restore the initial
directory.  After that, you can do what you wish.  On my server at least
(HP TCPIP V5.4), "GET A/B/F.X" will work, but the mixed mess is unlikely
to work on any VMS FTP server.



   Steven M. Schweda   (+1) 651-699-9818
   382 South Warwick Street[EMAIL PROTECTED]
   Saint Paul  MN  55105-2547


Re: wget BUG: ftp file retrieval

2005-11-25 Thread Daniel Stenberg

On Fri, 25 Nov 2005, Steven M. Schweda wrote:

>   Or, better yet, _DO_ forget to prepend the trouble-causing $CWD to those
> paths.


I agree. What good would prepending do? It will most definitely add problems 
such as those Steven describes.


--
 -=- Daniel Stenberg -=- http://daniel.haxx.se -=-
  ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol


Re: wget BUG: ftp file retrieval

2005-11-25 Thread Hrvoje Niksic
Daniel Stenberg <[EMAIL PROTECTED]> writes:

> On Fri, 25 Nov 2005, Steven M. Schweda wrote:
>
>>   Or, better yet, _DO_ forget to prepend the trouble-causing $CWD to
>> those paths.
>
> I agree. What good would prepending do?

Prepending is already there, and adding it fixed many problems with
FTP servers that log you in a non-/ working directory.


Re: wget BUG: ftp file retrieval

2005-11-25 Thread Steven M. Schweda
From: Hrvoje Niksic

> Prepending is already there,

   Yes, it certainly is, which is why I had to disable it in my code for
VMS FTP servers.

>  and adding it fixed many problems with
> FTP servers that log you in a non-/ working directory.

   Which of those problems would _not_ be fixed by my two-step CWD for a
relative path?  That is:

  1. CWD to the string which the server reported in its initial PWD
 response.

  2. CWD to the relative path in the URL ("A/B" in our current
 example).

On a VMS server, the first path is probably pure VMS, so it works, and
the second path is pure UNIX, so it also works (on all the servers I've
tried, at least).  As I remark in the (seldom-if-ever-read) comments in
my "src/ftp.c", I see no reason why this scheme would fail on any
reasonable server.  But I'm always open to a good argument, especially
if it includes a demonstration of a good counter-example.
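
For illustration only, the two-step sequence on a hypothetical VMS server would
look something like this (made-up transcript following the description above):

   PWD
   257 "SYS$SYSDEVICE:[ANONYMOUS]" is current directory.
   CWD SYS$SYSDEVICE:[ANONYMOUS]       <- step 1: pure VMS path, as reported
   250 CWD command successful.
   CWD A/B                             <- step 2: pure UNIX path from the URL
   250 CWD command successful.
   RETR F.X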

   This (in my opinion, stinking-bad) prepending code is the worst part
of what makes the current (not-mine) VMS FTP server code so awful. 
(Running a close second is the part which discards the device name from
the initial PWD response, which led to a user complaint in this forum a
while back, involving an inability to specify a different device in a
URL.)



   Steven M. Schweda   (+1) 651-699-9818
   382 South Warwick Street[EMAIL PROTECTED]
   Saint Paul  MN  55105-2547


Re: wget BUG: ftp file retrieval

2005-11-26 Thread Hrvoje Niksic
[EMAIL PROTECTED] (Steven M. Schweda) writes:

>> and adding it fixed many problems with FTP servers that log you in
>> a non-/ working directory.
>
> Which of those problems would _not_ be fixed by my two-step CWD for
> a relative path?  That is: [...]

That should work too.  On Unix-like FTP servers, the two methods would
be equivalent.

Thanks for the suggestion.  I realized your patch contained
improvements for dealing with VMS FTP servers, but I somehow managed
to miss this explanation.


Re: wget BUG: ftp file retrieval

2005-11-26 Thread Steven M. Schweda
From: Hrvoje Niksic

> [...]  On Unix-like FTP servers, the two methods would
> be equivalent.

   Right.  So I resisted temptation, and kept the two-step CWD method in
my code for only a VMS FTP server.  My hope was that someone would look
at the method, say "That's a good idea", and change the "if" to let it
be used everywhere.

   Of course, I'm well known to be delusional in these matters.



   Steven M. Schweda   (+1) 651-699-9818
   382 South Warwick Street[EMAIL PROTECTED]
   Saint Paul  MN  55105-2547


Re: wget bug - after closing control connection

2001-03-08 Thread csaba . raduly


Which version of wget do you use?  Are you aware that wget 1.6 has been
released and 1.7 is in development (and they contain a workaround for the
"Lying FTP server syndrome" you are seeing)?
--
Csaba Ráduly, Software Engineer  Sophos Anti-Virus
email: [EMAIL PROTECTED]   http://www.sophos.com
US support: +1 888 SOPHOS 9UK Support: +44 1235 559933






wget bug (?): --page-requisites should supercede robots.txt

2002-09-22 Thread Jamie Flournoy

Using wget 1.8.2:

$ wget --page-requisites http://news.com.com

...fails to retrieve most of the files that are required to properly 
render the HTML document, because they are forbidden by 
http://news.com.com/robots.txt .

I think that use of --page-requisites implies that wget is being used as 
a "save this entire web page as..." utility for later human viewing, 
rather than a text indexing spider that wants to analyze the content but 
not the presentation. So I believe that wget should ignore robots.txt 
when --page-requisites is specified.

If you agree then I'll try to write a patch & send it to you this 
week... please let me know if you agree or disagree. Thanks!
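
(For anyone hitting this in the meantime: if your build honours the `robots'
wgetrc command and the -e/--execute option, "wget -e robots=off
--page-requisites http://news.com.com" works around it, at the cost of
ignoring robots.txt for the whole run.)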


--- the gory bits:

   "wget -d --page-requisites http://news.com.com"; says:

appending "http://news.com.com/i/hdrs/ne/y_fd.gif"; to urlpos.

   etc., but then later says:

Deciding whether to enqueue "http://news.com.com/i/hdrs/ne/y_fd.gif".
Rejecting path i/hdrs/ne/y_fd.gif because of rule `i/'.
Not following http://news.com.com/i/hdrs/ne/y_fd.gif because robots.txt 
forbids it.
Decided NOT to load it.






dificulty with Debian wget bug 137989 patch

2003-09-29 Thread jayme
I tried the patch from Debian bug report 137989 and it didn't work. Can anybody explain:
1 - why I have to make two directories for the patch to work: one wget-1.8.2.orig and one 
wget-1.8.2?
2 - why, after compilation, wget still can't download a file > 2GB?
Note: I cut out the Debian-specific part of the patch (the first diff).
Thank you
Jayme 
[EMAIL PROTECTED]


I want to report a wget bug

2004-11-24 Thread jiaming
Hello!
  I am very pleased to use wget to crawl pages. It is an excellent tool. 
Recently I found a bug while using wget, although I am not sure whether it's a bug 
or incorrect usage. I just want to report it here.
When I use wget to mirror or recursively download a web site with the -O 
option, I mean to mirror the whole site's pages into one file. But when I type 
"./wget -m -O filename http://site", I can only save the index file of the site 
into the file filename. Surprisingly, when I first type "./wget -m http://site", 
stop the crawling process after some pages have been successfully downloaded 
(these pages are saved to a hierarchy the same as the website itself), and then 
use the -O option again for the same web site, the mirror option will take 
effect.
I will be looking forward to hearing from you,
   Thanks


jiaming
[EMAIL PROTECTED]
  2004-11-25


wget bug: doesn't CWD after ftp failure

2006-03-05 Thread Nate Eldredge

Hi folks,

I think I have found a bug in wget where it fails to change the working 
directory when retrying a failed ftp transaction.  This is wget 1.10.2 on 
FreeBSD-6.0/amd64.


I was trying to use wget to get files from a broken ftp server which 
occasionally sends garbled responses, causing wget to get confused, 
eventually timeout, and retry the transfer.  (The failure mode which makes 
it most obvious is sending a response to PASV which lacks the initial 
numeric response code, so that wget can't recognize it.)  This is fine. 
However, when wget reconnects, it mistakenly thinks it is already in the 
appropriate directory, and it doesn't change it, reporting "CWD not 
required".  This results in it trying to fetch the file from the root 
directory instead of the correct path.


Unfortunately I can't give you access to the server in question.  I can 
sanitize the output of a wget session if you want.  However, I think the 
bug is obvious from inspection.  At ftp.c:1197 in ftp_loop_internal() we 
have


  err = getftp (u, &len, restval, con);

  if (con->csock != -1)
    con->st &= ~DONE_CWD;
  else
    con->st |= DONE_CWD;

This test seems clearly to be backwards.  If con->csock is -1 (i.e. the 
connection has been closed) then we must clear the DONE_CWD flag. 
Otherwise CWD has been done and we can set the flag.
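
In other words, with the test reversed (a sketch of the fix just described, not 
an official patch):

  err = getftp (u, &len, restval, con);

  if (con->csock != -1)
    con->st |= DONE_CWD;
  else
    con->st &= ~DONE_CWD;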


Reversing the test fixes the problem.  It also causes the CWD optimization 
to actually work when it's applicable, instead of only when it isn't :)


It might be worthwhile at other spots in the code to put in an assert() to 
ensure that we have (DO_CWD || !DO_LOGIN).  Perhaps after those flags are 
set, e.g. ftp.c:1161 in ftp_loop_internal() and ftp.c:1409 in 
ftp_retrieve_list().  Also the existence of both DONE_CWD and DO_CWD may 
cause confusion and could probably be cleaned up.


Thanks for working on wget!  It's a great tool.

--
Nate Eldredge
[EMAIL PROTECTED]


wget bug in finding files after disconnect

2006-11-15 Thread Paul Bickerstaff
I'm using wget version "GNU Wget 1.10.2 (Red Hat modified)" on a fedora
core5 x86_64 system (standard wget rpm). I'm also using version 1.10.2b
on a WinXP laptop. Both display the same faulty behaviour which I don't
believe was present in earlier versions of wget that I've used.

When the internet connection disconnects wget automatically tries to
redownload the file (starting from where it was disconnected).

The problem is that it is consistently failing to find the file. The
following output shows what is happening.

wget -c ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/nr.*.tar.gz
--13:13:34--
ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/nr.*.tar.gz
   => `.listing'
Resolving bio-mirror.jp.apan.net... 150.26.2.58
Connecting to bio-mirror.jp.apan.net|150.26.2.58|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.==> PWD ... done.
==> TYPE I ... done.  ==> CWD /pub/biomirror/blast ... done.
==> PASV ... done.==> LIST ... done.

[  <=>] 4,657 10.78K/s

13:13:43 (10.75 KB/s) - `.listing' saved [4657]

Removed `.listing'.
--13:13:43--
ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/nr.00.tar.gz
   => `nr.00.tar.gz'
==> CWD /pub/biomirror/blast ... done.
==> PASV ... done.==> REST 240604000 ... done.
==> RETR nr.00.tar.gz ... done.
Length: 875,518,563 (835M), 634,914,563 (606M) remaining

36% [++==>] 315,859,6009.63K/s  ETA
7:26:45

14:13:53 (20.39 KB/s) - Control connection closed.
Retrying.

--14:13:54--
ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/nr.00.tar.gz
  (try: 2) => `nr.00.tar.gz'
Connecting to bio-mirror.jp.apan.net|150.26.2.58|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.==> PWD ... done.
==> TYPE I ... done.  ==> CWD not required.
==> PASV ... done.==> REST 315859600 ... done.
==> RETR nr.00.tar.gz ...
No such file `nr.00.tar.gz'.

--14:13:59--
ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/nr.01.tar.gz
   => `nr.01.tar.gz'
==> CWD /pub/biomirror/blast ... done.
==> PASV ... done.==> REST 270872000 ... done.
==> RETR nr.01.tar.gz ... done.
Length: 362,103,053 (345M), 91,231,053 (87M) remaining

91% [+++=>] 331,517,200   11.55K/s
ETA 30:23

15:14:15 (16.39 KB/s) - Control connection closed.
Retrying.

--15:14:16--
ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/nr.01.tar.gz
  (try: 2) => `nr.01.tar.gz'
Connecting to bio-mirror.jp.apan.net|150.26.2.58|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.==> PWD ... done.
==> TYPE I ... done.  ==> CWD not required.
==> PASV ... done.==> REST 331517200 ... done.
==> RETR nr.01.tar.gz ...
No such file `nr.01.tar.gz'.

You can see that there are two files to be downloaded nr.00.tar.gz and
nr.01.tar.gz. Both downloads get interrupted and after the Control
connection is closed the reattempt fails to find the file. This is
consistent repeatable behaviour.

I have checked and the files are there and have not moved or altered in
any way.

I believe that the problem is almost certainly associated with the
logged item "CWD not required" after a reconnect.

Cheers
-- 
Paul Bickerstaff
Managing Director & Bioinformatics Consultant
Infoics Limited
PO Box 83153, Edmonton, Auckland 0652, New Zealand.
Mobile:   +64-21-390 266
Home/Fax: +64-9-837 8202
[EMAIL PROTECTED]  www.infoics.com



wget bug with following to a new location

2001-01-22 Thread Volker Kuhlmann

I came across this bug in wget where it gives an error instead of
following, as it should.

Volker


> wget --version
GNU Wget 1.5.3

Copyright (C) 1995, 1996, 1997, 1998 Free Software Foundation, Inc.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

Written by Hrvoje Niksic <[EMAIL PROTECTED]>.

> cat ./Wns 
#!/bin/sh
#
exec wget -Y0 --proxy-pass=none --cache=off \
-U 'Mozilla/4.7 [en] (WinNT; I)' \
"$@"


> ./Wns http://www.themestream.com/articles/298752.html -S
--08:52:39--  http://www.themestream.com:80/articles/298752.html
   => `www.themestream.com/articles/298752.html'
Connecting to www.themestream.com:80... connected!
HTTP request sent, awaiting response... 302 Found
2 Connection: close
3 Date: Sun, 21 Jan 2001 19:52:05 GMT
4 Server: Apache/1.3.12 (Unix)  (Red Hat/Linux)
5 Cache-Control: no-cache
6 Content-Type: text/html; charset=ISO-8859-1
7 Expires: 0
8 Location: /gspd_browse/browse/view_article.gsp?c_id=298752&id_list=&cookied=T
9 Pragma: no-cache
10 Set-Cookie: g-id=pfkocnholppacgapgkll.10024743; expires=Fri, 01-Jan-2010 20:00:
00 GMT; path=/
11 Set-Cookie: g-id=fnfafclcgnjdmpaccfda.10024748; expires=Fri, 01-Jan-2010 20:00:
00 GMT; path=/
12 Content-Length: 0
13 
Location: /gspd_browse/browse/view_article.gsp?c_id=298752&id_list=&cookied=T [fol
lowing]
/gspd_browse/browse/view_article.gsp?c_id=298752&id_list=&cookied=T: Unknown/unsup
ported protocol.
Exit 1

> ./Wns http://www.themestream.com/'gspd_browse/browse/view_article.gsp?c_id=29875
2&id_list=&cookied=T' -S
--08:53:51--  http://www.themestream.com:80/gspd_browse/browse/view_article.gsp?c_
id=298752&id_list=&cookied=T
   => `www.themestream.com/gspd_browse/browse/view_article.gsp?c_id=298752
&id_list=&cookied=T'
Connecting to www.themestream.com:80... connected!
HTTP request sent, awaiting response... 200 OK
2 Connection: close
3 Date: Sun, 21 Jan 2001 19:54:03 GMT
4 Server: Apache/1.3.12 (Unix)  (Red Hat/Linux) PHP/3.0.15 mod_perl/1.21
5 Cache-Control: no-cache
6 Content-Type: text/html; charset=ISO-8859-1
7 Expires: 0
8 Pragma: no-cache
9 Set-Cookie: g-id=lacchjobhehcipblkmbg.10024818; expires=Fri, 01-Jan-2010 20:00:0
0 GMT; path=/
10 Content-Length: 21735
11 

0K -> .. .. .[100%]

Last-modified header missing -- time-stamps turned off.
08:53:56 (7.52 KB/s) - `www.themestream.com/gspd_browse/browse/view_article.gsp?c_
id=298752&id_list=&cookied=T' saved [21735/21735]



Wget bug: 32 bit int for "bytes downloaded".

2002-08-04 Thread Rogier Wolff


It seems wget uses a 32 bit integer for the "bytes downloaded":

[...]
FINISHED --17:11:26--
Downloaded: 1,047,520,341 bytes in 5830 files
cave /home/suse8.0# du -s
5230588 .
cave /home/suse8.0# 
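
(The numbers are roughly consistent with the counter wrapping modulo 2^32:
5,230,588 1K blocks is about 5.36 GB, and 5.36 GB minus 2^32 bytes (about
4.29 GB) leaves about 1.06 GB, close to the 1,047,520,341 bytes reported.)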

As it's a "once per download" variable I'd say it's not that performance
critical...

Roger. 



Wget Bug: Re: not downloading everything with --mirror

2002-08-15 Thread Max Bowsher

Funk Gabor wrote:
>> HTTP does not provide a dirlist command, so wget parses html to find
>> other files it should download. Note: HTML not XML. I suspect that
>> is the problem.
>
> If wget wouldn't download the rest, I'd say that too. But first the dir
> gets created, the xml is downloaded (in some other directory some *.gif
> too) so wget "senses" the directory. If I issue the wget -m site/dir
> then all of the rest comes down, (index.html?D=A and others too) so
> wget is able to get everything but not at once. So there would be no
> technical limitation for wget to make it happen in one step. So it is
> either a missing feature (shall I say, a "bug" as wget can't do the
> mirror which it could've) or I was unable to find some switch which
> makes it happen at once.

Hmm, now I see. The vast majority of websites are configured to deny directory
viewing. That is probably why wget doesn't bother to try, except for the
directory specified as the root of the download. I don't think there is any
option to do this for all directories, because it's not really needed. The _real_
bug is that wget is failing to parse what look like valid <img> tags. Perhaps
someone more familiar with wget's html parsing code could investigate? The
command is: wget -r -l0 www.jeannette.hu/saj.htm and the ignored files are a
number of image files.

Max.




Re: dificulty with Debian wget bug 137989 patch

2003-09-30 Thread Hrvoje Niksic
"jayme" <[EMAIL PROTECTED]> writes:
[...]

Before anything else, note that the patch originally written for 1.8.2
will need change for 1.9.  The change is not hard to make, but it's
still needed.

The patch didn't make it to canonical sources because it assumes `long
long', which is not available on many platforms that Wget supports.
The issue will likely be addressed in 1.10.

Having said that:

> I tried the patch Debian bug report 137989 and didnt work. Can
> anybody explain:
> 1 - why I have to make to directories for patch work: one
> wget-1.8.2.orig and one wget-1.8.2 ?

You don't.  Just enter Wget's source and type `patch -p1 <...'.

> 2 - why after compilation the wget still cant download the file >
> 2GB ?

I suspect you've tried to apply the patch to Wget 1.9-beta, which
doesn't work, as explained above.



wget bug in retrieving large files > 2 gig

2004-03-09 Thread Eduard Boer
Hi,

While downloading a file of about 3,234,550,172 bytes with "wget 
http://foo/foo.mpg" I get an error:

HTTP request sent, awaiting response... 200 OK
Length: unspecified [video/mpeg]
    [ <=>                                     ] -1,060,417,124   13.10M/s

wget: retr.c:292: calc_rate: Assertion `bytes >= 0' failed.
Aborted
The md5sum of the downloaded and the original file is the same! So there should 
not be an error.
The amount of 'bytes downloaded' shown during the transfer is not correct either: 
it becomes negative beyond 2 GB.
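
(The negative figure is exactly a 32-bit wraparound: 3,234,550,172 - 2^32 =
-1,060,417,124, the very value shown in the progress line above.)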

greetings from the Netherlands,
Eduard



wget bug: spaces in directories mapped to %20

2005-01-16 Thread Tony O'Hagan
Recently I used the following wget command under a hosted linux account:
 $ wget -mirror  -o mirror.log
The web site contained files and virtual directories that contained spaces 
in the names.
URL encoding translated these spaces to %20.

wget correctly URL decoded the file names (creating file names containing 
spaces) but incorrectly failed to URL decode the directory names (creating 
directory paths containing %20 instead of spaces).  The resulting mirror 
therefore contained broken links.  Some hyperlinks were embedded inside 
Flash graphics files, so hyperlink renaming was not an option.  Personally, 
I would never put a space in a web-hosted file or directory name, but in this 
case I was migrating a web site that had been developed by someone else.  I 
think that mirroring should work regardless in this case.

Example:
Original path:  abc def/xyz pqr.gif
After wget mirroring:   abc%20def/xyz pqr.gif   (broken link)
wget --version  is GNU Wget 1.8.2
Thanks for the invaluable wget.
Tony O'Hagan.

--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.300 / Virus Database: 265.6.13 - Release Date: 16/01/2005


wget bug when using proxy, https, & digest authentication

2005-07-21 Thread Corey Wright
all patches are against wget 1.10.

please cc me on all responses as i am not subscribed to this list.

FIRST BUG

there is a bug in http.c.

when connecting by way of proxy & https, if digest authentication is
necessary, then the first connection attempt fails and we go to
retry_with_auth.  that much is as expected.  upon our second attempt,
unlike the first attempt, we don't set conn = proxy (as conn was most
recently set to u), so instead we try to connect directly to the host,
which fails (because the proxy must be used).

just as with the first attempt, we must set conn = proxy so that our
first connection is made to the proxy and not directly to the host.  but
setting conn = proxy occurs before retry_with_auth, so during our second
attempt conn = u because that is what it was last set to.  and as such
we try to connect directly to the host, not to the proxy as we should.
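
a minimal sketch of that first fix, reusing the names mentioned above
(hypothetical and untested -- not the actual http.c layout):

 retry_with_auth:
  /* re-select the connection target for this attempt: the proxy when one
     is in use, the requested host otherwise */
  conn = proxy ? proxy : u;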

i believe the alternative (and more proper approach) is to delete these
lines:

  /* SOCK is now *really* connected to u->host, so update CONN
 to reflect this.  That way register_persistent will
 register SOCK as being connected to u->host:u->port.  */
  conn = u;

as we will never register a connection through a proxy (because we will
never request for it to be kept alive as per inhibit_keep_alive).  but i
just noticed this alternative and haven't had a chance to develop a
patch and test it.

SECOND BUG

there is another "bug" in http.c.  it is not "secure by default" as we
initially send all user names and passwords as basic authentication,
though digest authentication may be desired/needed.  i realize that
should basic authentication be all that is needed, removing the basic
authentication by default doubles the number of requests (first attempt
to learn we need basic authentication, second attempt to send basic
authentication), but i believe that is the necessary trade-off for
security.  should a user desire to use basic authentication by default,
i think an option should be added to allow it, but the default behavior
of wget should not unnecessarily compromise a user's name & password.

the second patch rectifies the problem, but a more complete patch would
include adding a command-line switch to always send basic authentication
by default.

THANKS

thanks to the people that maintain the mingw build configuration as it
was refreshing to be able to build my patched wget on windows using a
free software tool chain.  (having to build free software using a
non-free toolchain defeats the purpose.)

corey
-- 
[EMAIL PROTECTED]


proxy_after_auth_failure.patch
Description: Binary data


no_default_basic_auth.patch
Description: Binary data


[WGET BUG] - Can not retreive image from cacti

2006-06-19 Thread Thomas GRIMONET
Hello,

We are using version 1.10.2 of wget under Ubuntu and Debian. So we have many 
scripts that get some images from a cacti site. These scripts ran perfectly 
with version 1.9 of wget but they cannot get images with version 1.10.2 of 
wget.

Here you can find an example of our scripts:

sub GetCactiGraph()
{
my ($node,$alt,$time,$filename) = @_;

my $url = "https://foo.bar/cacti/";;
my $b = WWW::Mechanize->new();

$b->get($url);
$b->field("login_username", "user");
$b->field("login_password", "user");
$b->click();

if ($b->content() =~ /, gFld\(".*$node", "(.+)"\)\)/g)
{
$b->get($url . $1);
if ($b->content() =~ //g)
{
my $period = ($time eq "day" ? "&rra_id=1" : 
"&rra_id=3");
print "WGET: $url$1$period -O $filename\n";
if (defined $filename)
{ `wget -q "$url$1$period" -O "$filename"`; 
return $filename;}
else 
{ `wget --no-check-certificate -q 
"$url$1$period" -O "$alt.png"`; 
return "$alt.png" ;}
}
}
}

File is created but it is empty.


Bye,

Thomas


Re: wget bug in finding files after disconnect

2006-11-18 Thread Georg Schulte Althoff
Paul Bickerstaff <[EMAIL PROTECTED]> wrote in 
news:[EMAIL PROTECTED]:

> I'm using wget version "GNU Wget 1.10.2 (Red Hat modified)" on a fedora
> core5 x86_64 system (standard wget rpm). I'm also using version 1.10.2b
> on a WinXP laptop. Both display the same faulty behaviour which I don't
> believe was present in earlier versions of wget that I've used.
> 
> When the internet connection disconnects wget automatically tries to
> redownload the file (starting from where it was disconnected).
> 
> The problem is that it is consistently failing to find the file. The
> following output shows what is happening.
> 
> wget -c ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/nr.*.tar.gz
[...]
> Retrying.
> 
> --14:13:54--
> ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/nr.00.tar.gz
>   (try: 2) => `nr.00.tar.gz'
> Connecting to bio-mirror.jp.apan.net|150.26.2.58|:21... connected.
> Logging in as anonymous ... Logged in!
> ==> SYST ... done.==> PWD ... done.
> ==> TYPE I ... done.  ==> CWD not required.
> ==> PASV ... done.==> REST 315859600 ... done.
> ==> RETR nr.00.tar.gz ...
> No such file `nr.00.tar.gz'.
> 
[...]
> 
> I have checked and the files are there and have not moved or altered in
> any way.
> 
> I believe that the problem is almost certainly associated with the
> logged item "CWD not required" after a reconnect.
> 
> Cheers

I encountered the same situation and solved it this way:
Call wget with -B (--base) option to set base directory
and with -i (--input-file) to point to a file containing
the relative URLs you want to download.

Not tested, but it should look like this:
  wget -c \
    --base=ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/ \
    --input-file=urls.txt
with urls.txt containing
  nr.*.tar.gz

Hope it helps you.

Georg



Re: wget bug with following to a new location

2001-01-23 Thread Hack Kampbjørn

Volker Kuhlmann wrote:
> 
> I came across this bug in wget where it gives an error instead of
> following, as it should.
> 
> Volker
> 
> > wget --version
> GNU Wget 1.5.3
Hmm that's quite old ...
> 
> Copyright (C) 1995, 1996, 1997, 1998 Free Software Foundation, Inc.
> This program is distributed in the hope that it will be useful,
> but WITHOUT ANY WARRANTY; without even the implied warranty of
> MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> GNU General Public License for more details.
> 
> Written by Hrvoje Niksic <[EMAIL PROTECTED]>.
> 
> > cat ./Wns
> #!/bin/sh
> #
> exec wget -Y0 --proxy-pass=none --cache=off \
> -U 'Mozilla/4.7 [en] (WinNT; I)' \
> "$@"
-U that was in 1.5.3 release so you're using some patches ...

> 
> > ./Wns http://www.themestream.com/articles/298752.html -S
> --08:52:39--  http://www.themestream.com:80/articles/298752.html
>=> `www.themestream.com/articles/298752.html'
> Connecting to www.themestream.com:80... connected!
> HTTP request sent, awaiting response... 302 Found
> 2 Connection: close
> 3 Date: Sun, 21 Jan 2001 19:52:05 GMT
> 4 Server: Apache/1.3.12 (Unix)  (Red Hat/Linux)
> 5 Cache-Control: no-cache
> 6 Content-Type: text/html; charset=ISO-8859-1
> 7 Expires: 0
> 8 Location: /gspd_browse/browse/view_article.gsp?c_id=298752&id_list=&cookied=T

Aha no absolute location, this is a common "bug" in so many web-sites
8-(

> 9 Pragma: no-cache
> 10 Set-Cookie: g-id=pfkocnholppacgapgkll.10024743; expires=Fri, 01-Jan-2010 20:00:
> 00 GMT; path=/
> 11 Set-Cookie: g-id=fnfafclcgnjdmpaccfda.10024748; expires=Fri, 01-Jan-2010 20:00:
> 00 GMT; path=/
> 12 Content-Length: 0
> 13
> Location: /gspd_browse/browse/view_article.gsp?c_id=298752&id_list=&cookied=T [fol
> lowing]
> /gspd_browse/browse/view_article.gsp?c_id=298752&id_list=&cookied=T: Unknown/unsup
> ported protocol.
> Exit 1

Of course unknown protocol ...

Now seriously: This "bug" has been fixed in release 1.6; look at the
web-site http://sunsite.dk/wget or on a GNU mirror near you (tm)

> 
> > ./Wns http://www.themestream.com/'gspd_browse/browse/view_article.gsp?c_id=29875
> 2&id_list=&cookied=T' -S
> --08:53:51--  http://www.themestream.com:80/gspd_browse/browse/view_article.gsp?c_
> id=298752&id_list=&cookied=T
>=> `www.themestream.com/gspd_browse/browse/view_article.gsp?c_id=298752
> &id_list=&cookied=T'
> Connecting to www.themestream.com:80... connected!
> HTTP request sent, awaiting response... 200 OK
> 2 Connection: close
> 3 Date: Sun, 21 Jan 2001 19:54:03 GMT
> 4 Server: Apache/1.3.12 (Unix)  (Red Hat/Linux) PHP/3.0.15 mod_perl/1.21
> 5 Cache-Control: no-cache
> 6 Content-Type: text/html; charset=ISO-8859-1
> 7 Expires: 0
> 8 Pragma: no-cache
> 9 Set-Cookie: g-id=lacchjobhehcipblkmbg.10024818; expires=Fri, 01-Jan-2010 20:00:0
> 0 GMT; path=/
> 10 Content-Length: 21735
> 11
> 
> 0K -> .. .. .[100%]
> 
> Last-modified header missing -- time-stamps turned off.
> 08:53:56 (7.52 KB/s) - `www.themestream.com/gspd_browse/browse/view_article.gsp?c_
> id=298752&id_list=&cookied=T' saved [21735/21735]

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: wget bug: spaces in directories mapped to %20

2005-01-17 Thread Jochen Roderburg
Zitat von Tony O'Hagan <[EMAIL PROTECTED]>:

> Original path:  abc def/xyz pqr.gif
> After wget mirroring:   abc%20def/xyz pqr.gif   (broken link)
>
> wget --version  is GNU Wget 1.8.2
>

This was a "well-known error" in the 1.8 versions of wget, which is already
corrected in the 1.9 versions.

Regards,

Jochen Roderburg
ZAIK/RRZK
University of Cologne
Robert-Koch-Str. 10 Tel.:   +49-221/478-7024
D-50931 Koeln   E-Mail: [EMAIL PROTECTED]
Germany



Re: [WGET BUG] - Can not retreive image from cacti

2006-06-19 Thread Steven M. Schweda
From Thomas GRIMONET:

> [...]
> File is created but it is empty.

   That's normal with "-O" if Wget fails for some reason.

   It might help the diagnosis to see the actual Wget command instead of
the code which generates the Wget command.  If that doesn't show you
anything, then adding "-d" to the command might help more.

   Normally, when Wget fails for some reason, it emits an error message. 
Where's yours?



   Steven M. Schweda   [EMAIL PROTECTED]
   382 South Warwick Street(+1) 651-699-9818
   Saint Paul  MN  55105-2547


WGet Bug: Local URLs containing colons do not work

2006-12-10 Thread Peter Fletcher

Hi,

I am trying to download a Wiki category for off-line browsing,
and am using a command-line like this:

wget http://wiki/Category:Fish -r -l 1 -k

Wiki categories contain colons in their filenames, for example:

Category:Fish

If I request that wget convert absolute paths to relative links, then
it will create a link like this:

Fish

Unfortunately, this is not a valid URL, because the browser
interprets the 'Category:' as the invalid protocol
"Category", not the local filename 'Category:Fish'

You can get wget to replace the : with an escaped character
using --restrict-file-names=windows, but unfortunately this
does not fix the problem because the browser will un-escape
the URL and will still continue to look for a file with a colon
in it.

I am not sure of the best way to address this bug, because I
am not sure if it possible to escape the ':' to prevent the
browser from treating it as a delimiter.

It might be best to be allowed to specify some other character,
such as '_', to be used to replace the ':' in both filename and URL.

Regards,

Peter Fletcher



wget bug: mirror doesn't delete files deleted at the source

2003-07-31 Thread Mordechai T. Abzug

I'd like to use wget in mirror mode, but I notice that it doesn't
delete files that have been deleted at the source site.  Ie.:

  First run: the source site contains "foo" and "bar", so the mirror now
  contains "foo" and "bar".

  Before second run: the source site deletes "bar" and replaces it with
  "ook", and the mirror is run again.

  After second run: the mirror now contains "foo", "bar", and "ook".

This is not usually the way that mirrors work; wget should delete
"bar" if it's not at the site.

- Morty


Re: wget bug: mirror doesn't delete files deleted at the source

2003-08-01 Thread Aaron S. Hawley
On Fri, 1 Aug 2003, Mordechai T. Abzug wrote:

> I'd like to use wget in mirror mode, but I notice that it doesn't
> delete files that have been deleted at the source site.  Ie.:
>
>   First run: the source site contains "foo" and "bar", so the mirror now
>   contains "foo" and "bar".
>
>   Before second run: the source site deletes "bar" and replaces it with
>   "ook", and the mirror is run again.
>
>   After second run: the mirror now contains "foo", "bar", and "ook".
>
> This is not usually the way that mirrors work; wget should delete
> "bar" if it's not at the site.

i don't disagree on your definition of "mirrors", but in Unix (and in GNU)
it's usually customary not to delete files without user permission.

http://www.google.com/search?q=wget+archives+delete+mirror+site%3Ageocrawler.com


new wget bug when doing incremental backup of very large site

2006-10-15 Thread dev
I was running wget to test mirroring an internal development site, and 
using large database dumps (binary format) as part of the content to 
provide me with a large number of binary files for the test.  For the 
test I wanted to see if wget would run and download a quantity of 500K 
files with 100GB of total data transferred.


The test was going fine and wget ran flawlessly for 3 days downloading 
almost the entire contents of the test site and I was at 85GB.  wget 
would have run until the very end and would have passed the test 
downloading all 100GB of the test files.


Then a power outage occurred, my local test box was not on battery 
backup, so I had to restart wget and the test.  wget did not refetch the 
binary backup files and gave (for each file that had already been 
retrieved the following message:


-
   => `/database/dbdump_107899.gz'
Connecting to ||:80... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable

   The file is already fully retrieved; nothing to do.
-

---
wget continued to run for about eight hours, and gave the above message 
on several thousand files, then crashed giving:

   wget: realloc: Failed to allocate 536870912 bytes; memory exhausted.


This was surprising because wget ran flawlessly on the initial download 
for several days but on a "refresh" or incremental backup of the data, 
wget crashed after eight hours.   

I believe it has something to do with the code that is run when wget 
already finds a local file with the same name and sends a "range" 
request.  Maybe there is some data structure that keeps getting added to 
so it exhausts the memory on my test box which has 2GB.  There were no 
other programs running on the test box.


This may be a bug.  To get around this for the purposes of my test, I would 
like to know if there is any way (any switch) to tell wget not to send 
any type of range request at all if the local file already exists -- to 
skip sending any request and move on when it finds a file with the same 
name.  I do not want it to check whether the file is newer or whether the 
file is complete; just skip it and go on to the next file.



I was running wget under cygwin on a Windows XP box.

The wget command that I ran was the following:
   wget -m -l inf --convert-links --page-requisites http://

I had the following .wgetrc file
$HOME/.wgetrc
#backup_converted=on
page_requisites=on
continue=on
dirstruct=on
#mirror=on
#noclobber=on
#recursive=on
wait=3
http_user=
http_passwd=
#convert_links=on
verbose=on
user_agent=firefox
dot_style=binary


Re: new wget bug when doing incremental backup of very large site

2006-10-15 Thread Steven M. Schweda
   1. It would help to know the wget version ("wget -V").

   2. It might help to see some output when you add "-d" to the wget
command line.  (One existing file should be enough.)  It's not
immediately clear whose fault the 416 error is.  It might also help to
know which Web server is running on the server, and how big the file is
which you're trying to re-fetch.

> This was surprising [...]

   You're easily surprised.

> wget: realloc: Failed to allocate 536870912 bytes; memory exhausted.

   500MB sounds to me like a lot.

> [...] it exhausts the memory on my test box which has 2GB.

   A "memory exhausted" complaint here probably refers to virtual
memory, not physical memory.

> [...] I do not want it to check to see if the file is
> newer, if the file is complete, just skip it and go on to the next
> file.

   I haven't checked the code, but with "continue=on", I'd expect wget
to check the size and date together, and not download any real data if
the size checks, and the local file date is later.  The 416 error
suggests that it's trying to do a partial (byte-range) download, and is
failing because either it's sending a bad byte range, or the server is
misinterpreting a good byte range.  Adding "-d" should show what wget
thinks that it's sending.  Knowing that and the actual file size might
show a problem.

   If the "-d" output looks reasonable, the fault may lie with the
server, and an actual URL may be needed to pursue the diagnosis from
there.

   The memory allocation failure could be a bug, but finding it could be
difficult.



   Steven M. Schweda   [EMAIL PROTECTED]
   382 South Warwick Street(+1) 651-699-9818
   Saint Paul  MN  55105-2547


Re: new wget bug when doing incremental backup of very large site

2006-10-18 Thread dev
I checked and the .wgetrc file has "continue=on".  Is there any way to 
suppress sending the byte-range request?  I will read through the 
email and see if I can gather some more information that may be needed.


thanks



Steven M. Schweda wrote:

>    1. It would help to know the wget version ("wget -V").
> 
>    2. It might help to see some output when you add "-d" to the wget
> command line.  (One existing file should be enough.)  It's not
> immediately clear whose fault the 416 error is.  It might also help to
> know which Web server is running on the server, and how big the file is
> which you're trying to re-fetch.
> 
> > This was surprising [...]
> 
>    You're easily surprised.
> 
> > wget: realloc: Failed to allocate 536870912 bytes; memory exhausted.
> 
>    500MB sounds to me like a lot.
> 
> > [...] it exhausts the memory on my test box which has 2GB.
> 
>    A "memory exhausted" complaint here probably refers to virtual
> memory, not physical memory.
> 
> > [...] I do not want it to check to see if the file is
> > newer, if the file is complete, just skip it and go on to the next
> > file.
> 
>    I haven't checked the code, but with "continue=on", I'd expect wget
> to check the size and date together, and not download any real data if
> the size checks, and the local file date is later.  The 416 error
> suggests that it's trying to do a partial (byte-range) download, and is
> failing because either it's sending a bad byte range, or the server is
> misinterpreting a good byte range.  Adding "-d" should show what wget
> thinks that it's sending.  Knowing that and the actual file size might
> show a problem.
> 
>    If the "-d" output looks reasonable, the fault may lie with the
> server, and an actual URL may be needed to pursue the diagnosis from
> there.
> 
>    The memory allocation failure could be a bug, but finding it could be
> difficult.
> 
>    Steven M. Schweda   [EMAIL PROTECTED]
>    382 South Warwick Street(+1) 651-699-9818
>    Saint Paul  MN  55105-2547


Re: new wget bug when doing incremental backup of very large site

2006-10-21 Thread Steven M. Schweda
From dev:

> I checked and the .wgetrc file has "continue=on". Is there any way to
> surpress the sending of getting by byte range? I will read through the
> email and see if I can gather some more information that may be needed.

   Remove "continue=on" from ".wgetrc"?

   Consider:

  -N,  --timestamping            don't re-retrieve files unless newer than local.
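
   In .wgetrc terms (assuming the standard wgetrc command names), that would
be something like:

  # continue = on     <- drop this; it is what triggers the byte-range (REST) requests
  timestamping = on   # equivalent of -N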



   Steven M. Schweda   [EMAIL PROTECTED]
   382 South Warwick Street        (+1) 651-699-9818
   Saint Paul  MN  55105-2547


Wget Bug: recursive get from ftp with a port in the url fails

2006-04-12 Thread Jesse Cantara
I've encountered a bug when trying to do a recursive get from an ftp
site with a non-standard port defined in the url, such as
ftp.somesite.com:1234.  An example of the command I am typing is:

    "wget -r ftp://user:[EMAIL PROTECTED]:4321/Directory/*"

where "Directory" contains multiple subdirectories, all of which I wish
to get.  The output I get from wget is:

    ==> SYST ... done.    ==> PWD ... done.
    ==> TYPE I ... done.  ==> CWD /Bis ... done.
    ==> PASV ... done.    ==> LIST ... done.
    ftp.somehost.com:4321/Directory: No such file or directory
    ftp.somehost.com:4321/Directory/.listing: No such file or directory
    unlink: No such file or directory

And nothing is downloaded; wget stops executing there.  A quick
resolution to the problem is to use the "-nH" command line argument, so
that wget doesn't attempt to create that particular directory.  It
appears as if the problem is with the creation of a directory with a
':' in the name, which I cannot do outside of wget either.  I am not
sure if that is specific to my filesystem, or to linux in general.

I am using GNU Wget 1.10.2 in Linux version 2.6.14, Gentoo 3.3.6.
Apologies if this is already known, or if I have not provided enough
information.  I looked for a bug listing, and attempted to get as much
information as I can, but I am not a computer scientist or a
programmer.

Thank you very much for the wonderful program, it has helped me out in
many ways, and I hope this helps the developers.

-Jesse Cantara


Re: Wget Bug: recursive get from ftp with a port in the url fails

2006-04-13 Thread Hrvoje Niksic
"Jesse Cantara" <[EMAIL PROTECTED]> writes:

> A quick resolution to the problem is to use the "-nH" command line
> argument, so that wget doesn't attempt to create that particular
> directory. It appears as if the problem is with the creation of a
> directory with a ':' in the name, which I cannot do outside of wget
> either. I am not sure if that is specific to my filesystem, or to
> linux in general.

It's not specific to Linux, so it must be your file system.  Are you
perhaps running Wget on a FAT32-mounted partition?  If so, try using
--restrict-file-names=windows.
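
For example (a sketch only; the credentials, host, and path are
placeholders), something like

    wget -r --restrict-file-names=windows "ftp://user:password@ftp.somesite.com:4321/Directory/*"

should keep ':' and other characters that Windows file systems reject
out of the local directory names Wget creates.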

Thanks for the report.


[fwd] Wget Bug: recursive get from ftp with a port in the url fails

2007-09-17 Thread Hrvoje Niksic
--- Begin Message ---
Hi, I am using wget 1.10.2 on Windows 2003 and have the same problem as
Cantara.  The file system is NTFS.
Well, I found that my problem is that I wrote the command in the
scheduled task like this:

wget  -N -i D:\virus.update\scripts\kavurl.txt -r -nH -P
d:\virus.update\kaspersky

well, after "wget" and before "-N", I typed TWO spaces.

After deleting one space, wget works well again.

Hope this can help.

:)

-- 
from:baalchina
--- End Message ---


Re: [fwd] Wget Bug: recursive get from ftp with a port in the url fails

2007-09-17 Thread Micah Cowan
Hrvoje Niksic wrote:
> Subject: Re: Wget Bug: recursive get from ftp with a port in the url fails
> From: baalchina <[EMAIL PROTECTED]>
> Date: Mon, 17 Sep 2007 19:56:20 +0800
> To: [EMAIL PROTECTED]
> 
> 
> Hi, I am using wget 1.10.2 on Windows 2003 and have the same problem
> as Cantara.  The file system is NTFS.
> Well, I found that my problem is that I wrote the command in the
> scheduled task like this:
>  
> wget  -N -i D:\virus.update\scripts\kavurl.txt -r -nH -P
> d:\virus.update\kaspersky
>  
> well, after "wget" and before "-N", I typed TWO spaces.
>  
> After deleting one space, wget works well again.
>  
> Hope this can help.
>  
> :)

Hi baalchina,

Hrvoje forwarded your message to the Wget discussion mailing list, where
such questions are really more appropriate, especially since Hrvoje is
not maintaining Wget any longer, but has left that responsibility for
others.

What you're describing does not appear to be a bug in Wget; it's the
shell's (or task scheduler's, or whatever) responsibility to split
space-separated elements properly; the words are supposed to already be
split apart (properly) by the time Wget sees them.

Also, you didn't really describe what was going wrong with Wget, or
what message about its failure you were seeing (perhaps you'd need to
specify a log file with -o log, or use redirection if the command
interpreter supports it). However, if the problem is that Wget was
somehow seeing the space, as a separate argument or as part of another
one, then the bug lies with your task scheduler (or whatever is
interpreting the command line).
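
For example (reusing the command from the forwarded message; the log
file name is just a placeholder), a log could be captured with:

    wget -o D:\virus.update\wget.log -N -i D:\virus.update\scripts\kavurl.txt -r -nH -P d:\virus.update\kaspersky

The resulting log would then record exactly what Wget reported, which
makes this kind of problem much easier to diagnose.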

-- 
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


