Re: Failing assertion in Wget 2187

2006-08-28 Thread Mauro Tortonesi

Stefan Melbinger wrote:
By the way, as you might have noticed, I wanted to replace the real
domain names with example.com, but forgot to replace the last argument. :)


 >>   --domains='www.example.com,a.example.com,b.example.com'
 >>   --user-agent='Example'
 >>   --output-file='example.log'
 >>   'www.euroskop.cz'

So, just for the record, the real --domains value was 
'www.euroskop.cz,www2.euroskop.cz,rozcestnik.euroskop.cz'.


thanks.


In this case, that doesn't change the output, though.


right.

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it


Re: Failing assertion in Wget 2187

2006-08-28 Thread Stefan Melbinger
By the way, as you might have noticed, I wanted to replace the real
domain names with example.com, but forgot to replace the last argument. :)


>>   --domains='www.example.com,a.example.com,b.example.com'
>>   --user-agent='Example'
>>   --output-file='example.log'
>>   'www.euroskop.cz'

So, just for the record, the real --domains value was 
'www.euroskop.cz,www2.euroskop.cz,rozcestnik.euroskop.cz'.


In this case, that doesn't change the output, though.

Have a nice day,
Stefan

On 28.08.2006 16:44, Mauro Tortonesi wrote:

Stefan Melbinger wrote:

Hello everyone,

I'm having trouble with the newest trunk version of wget (revision 2187).


Command-line arguments:

wget
  --recursive
  --spider
  --no-parent
  --no-directories
  --follow-ftp
  --retr-symlinks
  --no-verbose
  --level='2'
  --span-hosts
  --domains='www.example.com,a.example.com,b.example.com'
  --user-agent='Example'
  --output-file='example.log'
  'www.euroskop.cz'

Results in:

wget: url.c:1934: getchar_from_escaped_string: Assertion `str && *str' 
failed.

Aborted

Can somebody reproduce this problem? Am I using an illegal combination of arguments? Any ideas?


(Worked before the newest patch.)


it's really weird. with this command:

wget -d --verbose --recursive --spider --no-parent --no-directories 
--follow-ftp --retr-symlinks --level='2' --span-hosts 
--user-agent='Mozilla/5.001 (windows; U; NT4.0; en-us) Gecko/25250101' 
--domains='www.example.com,a.example.com,b.example.com' 
http://www.euroskop.cz/


i get:

---response begin---
HTTP/1.0 200 OK
Date: Mon, 28 Aug 2006 14:35:14 GMT
Content-Type: text/html
Expires: Mon, 28 Aug 2006 14:35:14 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Server: Apache/1.3.26 (Unix) Debian GNU/Linux CSacek/2.1.9 PHP/4.1.2
X-Powered-By: PHP/4.1.2
Pragma: no-cache
Set-Cookie: PHPSESSID=b8af8e220f5f1f7321b86ce0524f88b2; expires=Tue, 29-Aug-06 14:35:14 GMT; path=/
Via: 1.1 proxy (NetCache NetApp/5.6.2R1)

---response end---
200 OK

Stored cookie www.euroskop.cz -1 (ANY) /   [expiry 
2006-08-29 16:35:14] PHPSESSID b8af8e220f5f1f7321b86ce0524f88b2

Length: unspecified [text/html]
Closed fd 3
200 OK

index.html: No such file or directory

FINISHED --16:37:42--
Downloaded: 0 bytes in 0 files


it seems there is a weird interaction between cookies and the recursive 
spider algorithm that makes wget bail out. i'll have to investigate this.




PS: Just FYI, when I compile I get the following warnings:

http.c: In function `http_loop':
http.c:2425: warning: implicit declaration of function `nonexisting_url'

main.c: In function `main':
main.c:1009: warning: implicit declaration of function 
`print_broken_links'


recur.c: In function `retrieve_tree':
recur.c:279: warning: implicit declaration of function `visited_url'


fixed, thanks.





Re: DNS through proxy with wget

2006-08-28 Thread Mauro Tortonesi

Karr, David wrote:

Inside our firewall, we can't do simple DNS lookups for hostnames
outside of our firewall.  However, I can write a Java program that uses
commons-httpclient, specifying the proxy credentials, and my URL
referencing an external host name will connect to that host perfectly
fine, obviously resolving the DNS name under the covers.

If I then use wget to do a similar request, even if I specify the proxy
credentials, it fails to find the host.  If I plug in the IP address
instead of the hostname, it works fine.

I noticed that the command-line options for wget allow me to specify the
proxy user and password, but they don't have a way to specify the proxy
host and port.


right. you have to specify the hostname/IP address and port of your 
proxy in your .wgetrc, or by means of the -e option:


wget -e 'http_proxy = http://yourproxy:8080/' --proxy-user=user 
--proxy-password=password -Y on http://someurl.com
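
For reference, the same proxy settings can also live in your ~/.wgetrc, so they don't have to be repeated on every invocation. A minimal sketch, reusing the placeholder proxy host, port, and credentials from the command above:

# ~/.wgetrc - proxy settings equivalent to the -e and --proxy-* options above
use_proxy = on
http_proxy = http://yourproxy:8080/
ftp_proxy = http://yourproxy:8080/
proxy_user = user
proxy_password = password   # spelled proxy_passwd in some older releases

With an HTTP proxy configured, wget sends the full URL to the proxy, so the DNS lookup for the external hostname is performed by the proxy rather than by your local resolver; that is why the request can succeed even though local DNS lookups for external hosts fail.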


--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it


Re: wget 1.11 beta 1 released

2006-08-28 Thread Mauro Tortonesi

Christopher G. Lewis wrote:

I've updated the Windows binaries to include Beta 1, and also included a
binary with Beta 1 plus today's patches 2186 and 2187 for recursive spider
mode.

Available here: http://www.ChristopherLewis.com/wget


thank you very much, chris. you're doing awesome work.


And sorry to those who have been having some problems downloading the
ZIPs from my site.  I had some weird IIS gzip compression issues.


we should plan to move the win32 binaries page to wget.sunsite.dk 
immediately after the 1.11 release. what do you think?


--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it


Re: Failing assertion in Wget 2187

2006-08-28 Thread Mauro Tortonesi

Stefan Melbinger wrote:

Hello everyone,

I'm having trouble with the newest trunk version of wget (revision 2187).

Command-line arguments:

wget
  --recursive
  --spider
  --no-parent
  --no-directories
  --follow-ftp
  --retr-symlinks
  --no-verbose
  --level='2'
  --span-hosts
  --domains='www.example.com,a.example.com,b.example.com'
  --user-agent='Example'
  --output-file='example.log'
  'www.euroskop.cz'

Results in:

wget: url.c:1934: getchar_from_escaped_string: Assertion `str && *str' 
failed.

Aborted

Can somebody reproduce this problem? Am I using an illegal combination of arguments? Any ideas?


(Worked before the newest patch.)


it's really weird. with this command:

wget -d --verbose --recursive --spider --no-parent --no-directories 
--follow-ftp --retr-symlinks --level='2' --span-hosts 
--user-agent='Mozilla/5.001 (windows; U; NT4.0; en-us) Gecko/25250101' 
--domains='www.example.com,a.example.com,b.example.com' 
http://www.euroskop.cz/


i get:

---response begin---
HTTP/1.0 200 OK
Date: Mon, 28 Aug 2006 14:35:14 GMT
Content-Type: text/html
Expires: Mon, 28 Aug 2006 14:35:14 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Server: Apache/1.3.26 (Unix) Debian GNU/Linux CSacek/2.1.9 PHP/4.1.2
X-Powered-By: PHP/4.1.2
Pragma: no-cache
Set-Cookie: PHPSESSID=b8af8e220f5f1f7321b86ce0524f88b2; expires=Tue, 29-Aug-06 14:35:14 GMT; path=/
Via: 1.1 proxy (NetCache NetApp/5.6.2R1)

---response end---
200 OK

Stored cookie www.euroskop.cz -1 (ANY) /   [expiry 
2006-08-29 16:35:14] PHPSESSID b8af8e220f5f1f7321b86ce0524f88b2

Length: unspecified [text/html]
Closed fd 3
200 OK

index.html: No such file or directory

FINISHED --16:37:42--
Downloaded: 0 bytes in 0 files


it seems there is a weird interaction between cookies and the recursive 
spider algorithm that makes wget bail out. i'll have to investigate this.
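
(For readers hitting the same abort: an assertion of the form "str && *str" fails when the function is handed either a NULL pointer or a pointer to an empty string. A minimal standalone C illustration of that failure mode, not wget's actual url.c code:)

#include <assert.h>

/* Hypothetical stand-in for getchar_from_escaped_string(): the real function
   reads the next (possibly %-escaped) character from a string, so it must
   never be called with nothing left to read. */
static char next_escaped_char(const char *str)
{
    assert(str && *str);   /* aborts for NULL or for the empty string "" */
    return *str;
}

int main(void)
{
    next_escaped_char("");   /* aborts much like the reported assertion */
    return 0;
}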




PS: Just FYI, when I compile I get the following warnings:

http.c: In function `http_loop':
http.c:2425: warning: implicit declaration of function `nonexisting_url'

main.c: In function `main':
main.c:1009: warning: implicit declaration of function `print_broken_links'

recur.c: In function `retrieve_tree':
recur.c:279: warning: implicit declaration of function `visited_url'


fixed, thanks.

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it


Re: wget 1.11 alpha1 [Fwd: Bug#378691: wget --continue doesn't work with HTTP]

2006-08-28 Thread Mauro Tortonesi

Jochen Roderburg wrote:


I have now tested the new wget 1.11 beta1 on my Linux system, and the above issue
is now solved. The "Remote file is newer" message now appears only when the
local file exists, and most of the other logic with time-stamping and
file-naming works as expected.


excellent.


In the meantime, however, I found another new problem with time-stamping, which mainly
occurs in connection with a proxy cache; I will report that in a new thread.
The same goes for a small problem with the SSL configuration.


thank you very much for the useful bug reports you keep sending us ;-)

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it


Re: wget man page - line missing

2006-08-28 Thread Mauro Tortonesi

tropikhajma wrote:

after line 665 there seems to be one line missing in the wget.1 man page

664 .Sp
665 For more information about the use of proxies with Wget, 
666 .IP "\fB\-Q\fR \fIquota\fR" 4

667 .IX Item "-Q quota"


this is already fixed in 1.11-beta-1. thank you very much for your report
anyway.


--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it


Re: wget 1.11 beta 1 released

2006-08-28 Thread Mauro Tortonesi

Noèl Köthe wrote:

On Tuesday, 22.08.2006, 17:00 +0200, Mauro Tortonesi wrote:

Hello, Mauro,


i've just released wget 1.11 beta 1:


Thanks.:)


you're very welcome to try it and report every bug you might encounter.


...
/usr/bin/make install 
DESTDIR=/home/nk/debian/wget/wget-experimental/wget-1.10.2+1.11.beta1/debian/wget
make[1]: Entering directory 
`/home/nk/debian/wget/wget-experimental/wget-1.10.2+1.11.beta1'
cd src && /usr/bin/make CC='gcc' CPPFLAGS='' DEFS='-DHAVE_CONFIG_H 
-DSYSTEM_WGETRC=\"/etc/wgetrc\" -DLOCALEDIR=\"/usr/share/locale\"' 
CFLAGS='-D_FILE_OFFSET_BITS=64 -g -Wall' LDFLAGS='' LIBS='-ldl -lrt  -lssl -lcrypto ' DESTDIR='' 
prefix='/usr' exec_prefix='/usr' bindir='/usr/bin' infodir='/usr/share/info' mandir='/usr/share/man' 
manext='1' install.bin
make[2]: Entering directory 
`/home/nk/debian/wget/wget-experimental/wget-1.10.2+1.11.beta1/src'
../mkinstalldirs /usr/bin
/usr/bin/install -c wget /usr/bin/wget
...

I set DESTDIR in line 1 to install it somewhere else, but in line 3 DESTDIR='' is passed down.

The problem should be fixed by this:

--- Makefile.in.orig2006-08-25 19:53:41.0 +0200
+++ Makefile.in 2006-08-25 19:53:55.0 +0200
@@ -77,7 +77,7 @@
 # flags passed to recursive makes in subdirectories
 MAKEDEFS = CC='$(CC)' CPPFLAGS='$(CPPFLAGS)' DEFS='$(DEFS)' \
 CFLAGS='$(CFLAGS)' LDFLAGS='$(LDFLAGS)' LIBS='$(LIBS)' \
-DESTDIR='$(DESTDIR=)' prefix='$(prefix)' exec_prefix='$(exec_prefix)' \
+DESTDIR='$(DESTDIR)' prefix='$(prefix)' exec_prefix='$(exec_prefix)' \
 bindir='$(bindir)' infodir='$(infodir)' mandir='$(mandir)' \
 manext='$(manext)'
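
(For context: the removed line references a make variable literally named "DESTDIR=", note the trailing '='. That variable is never defined, so GNU make silently expands it to the empty string, which is exactly the DESTDIR='' seen in line 3 of the log above. A minimal standalone sketch of the expansion difference, not part of wget's build; the file name and staging path are made up, and $(info) needs GNU make 3.81 or newer:)

# demo.mk - run with: make -f demo.mk DESTDIR=/tmp/stage
$(info with the typo:    '$(DESTDIR=)')   # prints: with the typo:    ''
$(info without the typo: '$(DESTDIR)')    # prints: without the typo: '/tmp/stage'
all: ;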


Fixed, thanks.

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it