Re: option changed: -nh -> -nH
Hi Noèl!

-nh and -nH are totally different. From wget 1.7.1 (I think the last version to offer both):

`-nh'
`--no-host-lookup'
     Disable the time-consuming DNS lookup of almost all hosts (*note Host Checking::).

`-nH'
`--no-host-directories'
     Disable generation of host-prefixed directories. By default, invoking Wget with `-r http://fly.srk.fer.hr/' will create a structure of directories beginning with `fly.srk.fer.hr/'. This option disables such behavior.

For wget 1.8.x, -nh became the default behavior; switching back to host lookup is not possible. I already complained that many old scripts now break, and suggested that entering -nh at the command line should either be silently ignored or should produce a warning while wget runs anyway. Apparently this was not regarded as useful.

CU
Jens

== original message ==

The option --no-host-directories changed from -nh to -nH (v1.8.1). Is there a reason for this? It breaks a lot of scripts when upgrading, I think. Could this be changed back to -nh? Thank you.

Noèl Köthe
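In practical terms, a script that only needs the directory behaviour can sidestep the short-option confusion discussed above entirely, because the long spelling --no-host-directories means the same thing in 1.7.1 and 1.8.1; a minimal sketch, using the URL from the documentation excerpt:

wget -r --no-host-directories http://fly.srk.fer.hr/

Only the old -nh (--no-host-lookup) has no 1.8.x equivalent, since the lookup it disabled is gone for good.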
cuj.com file retrieving fails -why?
Hi!

This problem is independent of whether a proxy is used or not: the download hangs, though I can read the content using konqueror. So what do the cuj people do to inhibit automatic download, and how can I circumvent it?

wget --proxy=off -d -U "Mozilla/5.0 (compatible; Konqueror/2.2.1; Linux)" -r http://www.cuj.com/images/resource/experts/alexandr.gif

DEBUG output created by Wget 1.7 on linux.

parseurl ("http://www.cuj.com/images/resource/experts/alexandr.gif") -> host www.cuj.com -> opath images/resource/experts/alexandr.gif -> dir images/resource/experts -> file alexandr.gif -> ndir images/resource/experts
newpath: /images/resource/experts/alexandr.gif
Checking for www.cuj.com in host_name_address_map.
Checking for www.cuj.com in host_slave_master_map.
First time I hear about www.cuj.com by that name; looking it up.
Caching www.cuj.com -> 66.35.216.85
Checking again for www.cuj.com in host_slave_master_map.
--14:32:35--  http://www.cuj.com/images/resource/experts/alexandr.gif
           => `www.cuj.com/images/resource/experts/alexandr.gif'
Connecting to www.cuj.com:80... Found www.cuj.com in host_name_address_map: 66.35.216.85
Created fd 3.
connected!
---request begin---
GET /images/resource/experts/alexandr.gif HTTP/1.0
User-Agent: Mozilla/5.0 (compatible; Konqueror/2.2.1; Linux)
Host: www.cuj.com
Accept: */*
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response...

nothing happens

Markus
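One way to at least bound a hang like this is wget's standard --timeout (-T) and --tries (-t) options; a sketch with purely illustrative values, otherwise the same command as above:

wget --proxy=off -d -T 30 -t 1 -U "Mozilla/5.0 (compatible; Konqueror/2.2.1; Linux)" http://www.cuj.com/images/resource/experts/alexandr.gif

With that, a dead connection gives up after 30 seconds instead of waiting indefinitely, which makes it easier to compare runs with and without the proxy.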
Re: cuj.com file retrieving fails -why?
Hallo Markus!

This is not a bug (I reckon) and should therefore have been sent to the normal wget list.

Using both wget 1.7.1 and 1.8.1 on Windows, the file is downloaded with

wget -d -U "Mozilla/5.0 (compatible; Konqueror/2.2.1; Linux)" -r http://www.cuj.com/images/resource/experts/alexandr.gif

as well as with

wget http://www.cuj.com/images/resource/experts/alexandr.gif

So I do not know what your problem is, but it is neither wget's nor cuj's fault, AFAICT.

CU
Jens
Re: cuj.com file retrieving fails -why?
On 3 Apr 2002 at 14:56, Markus Werle wrote:

Jens Rösner wrote: So, I do not know what your problem is, but it is neither wget's nor cuj's fault, AFAICT.

:-(

I've just built Wget 1.7 on Linux and it seemed to download your problem file okay. So I don't know what your problem is either!
Re: Referrer Faking and other nifty features
On 2002-04-03 08:50 -0500, Dan Mahoney, System Admin wrote:

1) referrer faking (i.e., wget automatically supplies a referrer based on the, well, referring page)

It is the --referer option; see (wget)HTTP Options in the Info documentation.

Yes, that allows me to specify _A_ referrer, like www.aol.com. When I'm trying to help my users mirror their old angelfire pages or something like that, very often the link has to come from the same directory. I'd like to see something where, when wget follows a link to another page or another image, it automatically supplies the URL of the page it followed to get there. Is there a way to do this?

Somebody already asked for this and AFAICT, there's no way to do that.

3) Multi-threading.

I suppose you mean downloading several URIs in parallel. No, wget doesn't support that. Sometimes, however, one may start several wgets in parallel, thanks to the shell (the & operator on Bourne shells).

No, I mean downloading multiple files from the SAME uri in parallel, instead of downloading files one-by-one-by-one (thus saving time on a fast pipe).

This doesn't make sense to me. When downloading from a single server, the bottleneck is generally either the server or the link; in either case, there's nothing to win by attempting several simultaneous transfers. Unless there are several servers at the same IP and the bottleneck is the server, not the link?

-- 
André Majorel <URL:http://www.teaser.fr/~amajorel/>
std::disclaimer (Not speaking for my employer);
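A sketch of the "several wgets in parallel, thanks to the shell" idea mentioned above, assuming a Bourne shell; the three URLs are placeholders standing in for whatever is actually being fetched:

wget http://host.example/a.iso &
wget http://host.example/b.iso &
wget http://host.example/c.iso &
wait    # block until all three background wgets have finished

This only pays off when the downloads come from different servers (or the bottleneck really is per-connection), which is exactly the caveat raised above.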
Re: cuj.com file retrieving fails -why?
On 3 Apr 2002 at 17:09, Markus Werle wrote:

Ian Abbott wrote: On 3 Apr 2002 at 14:56, Markus Werle wrote: I've just built Wget 1.7 on Linux and it seemed to download your problem file okay. So I don't know what your problem is either!

Ah! The kind of problem I like most! Did you have a special .wgetrc?

Nothing special.

$HOME/.wgetrc:
robots = off

system wgetrc:
# Comments stripped out
passive_ftp = on
waitretry = 10
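For comparing setups like this without editing any files, the same settings can be passed on the command line through wget's --execute/-e option; a sketch that simply mirrors the two files quoted above against the URL from the report:

wget -d -e robots=off -e passive_ftp=on -e waitretry=10 http://www.cuj.com/images/resource/experts/alexandr.gif

That rules out differences hidden in a system-wide or per-user wgetrc when two people try to reproduce the same download.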
Re: Referrer Faking and other nifty features
Andre Majorel wrote:

Yes, that allows me to specify _A_ referrer, like www.aol.com. When I'm trying to help my users mirror their old angelfire pages or something like that, very often the link has to come from the same directory. I'd like to see something where, when wget follows a link to another page or another image, it automatically supplies the URL of the page it followed to get there. Is there a way to do this?

Somebody already asked for this and AFAICT, there's no way to do that.

Not only is it possible, it is the behavior (at least in wget 1.8.1). If you run with -d, you will see that every GET after the first one includes the appropriate referer. If I execute:

wget -d -r http://www.exelana.com --referer=http://www.aol.com

the first request is reported as:

GET / HTTP/1.0
User-Agent: Wget/1.8.1
Host: www.exelana.com
Accept: */*
Connection: Keep-Alive
Referer: http://www.aol.com

but the third request is:

GET /left.html HTTP/1.0
User-Agent: Wget/1.8.1
Host: www.exelana.com
Accept: */*
Connection: Keep-Alive
Referer: http://www.exelana.com/

(The second request is for robots.txt and uses the referer from the command line.)

Tony
suspected bug in WGET 1.8.1
I'm using the NT port of WGET 1.8.1. FTP retrieval of files works fine; retrieval of directory listings fails. The problem happens under certain conditions when connecting to OS/2 FTP servers.

For example, if the current directory on the FTP server at login time is e:/abc, the command

wget ftp://userid:password@ipaddr/g:\def\test.doc

works fine to retrieve the file, but the command

wget ftp://userid:password@ipaddr/g:\def\

fails to retrieve the directory listing. For what it's worth, g:\def/, g:/def\ and g:/def/ also fail.

Matt Jackson
(919) 254-4547
[EMAIL PROTECTED]
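Whether it helps on that OS/2 server is not confirmed anywhere in this report, but one thing worth trying is percent-encoding the backslashes, since backslash is not a legal character in a URL path; a sketch using the same placeholder userid/password/ipaddr as above, with %5C as the standard escape for "\":

wget "ftp://userid:password@ipaddr/g:%5Cdef%5Ctest.doc"
wget "ftp://userid:password@ipaddr/g:%5Cdef%5C"

This is only a guess at a workaround, not a statement about where the actual bug lies.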
wget 1.8: FTP recursion through proxy servers does not work anymore
Hi!

In wget 1.8, FTP recursion through proxy servers does not work anymore. When I run

wget --execute "ftp_proxy = http://some.proxy:3128" -m ftp://some.host/dir/

wget only retrieves the file ftp://some.host/dir/index.html and stops.

I found the following ChangeLog entries:

* recur.c (recursive_retrieve): Enable FTP recursion through proxy servers.

(at that point recursion started working), and

* recur.c (url_queue_new): New function.
  (url_queue_delete): Ditto.
  (url_enqueue): Ditto.
  (url_dequeue): Ditto.
  (retrieve_tree): New function, replacement for recursive_retrieve.
  (descend_url_p): New function.
  (register_redirection): New function.

At this point (when retrieve_tree replaced recursive_retrieve) it probably was broken.

Best,
v.
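For what it's worth, the same proxy can also be configured through the environment, which wget reads as well and which keeps the quoting around --execute out of the picture; a sketch assuming a Bourne shell and the placeholder hosts from the report:

ftp_proxy=http://some.proxy:3128
export ftp_proxy
wget -m ftp://some.host/dir/

Either way of setting ftp_proxy should behave the same, so this does not work around the recursion problem itself; it just gives a second way to reproduce it.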
wget 1.8: FTP recursion through proxy servers does not work anymore
Hi!

An additional note: wget 1.7.1 works fine WRT FTP recursion through proxy servers; that was the last version before the change from recursive_retrieve to retrieve_tree was made (according to the ChangeLog). So, indeed, the change from recursive_retrieve to retrieve_tree introduced the bug.

Best,
v.