Re: wget-cvs-ifmodsince.patch
Ok, I have attached a new patch that moves the local time into http_stat. I am also sending this to [EMAIL PROTECTED] for others to try out. It seems to work great for me.

wget-cvs-ifmodsince.patch

ChangeLog:

Craig Sowadski <[EMAIL PROTECTED]>

  * http.c (If-Modified-Since): Implemented use of 'If-Modified-Since' header instead of checking 'Last-Modified' during the head-only request.

Description:

This patch modifies the time-stamping method by only comparing local and remote file sizes, and then using the 'If-Modified-Since' header during the request.

Craig Sowadski <[EMAIL PROTECTED]>

From: Hrvoje Niksic <[EMAIL PROTECTED]>
To: "Craig Sowadski" <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
Subject: Re: wget-cvs-ifmodsince.patch
Date: Thu, 12 Feb 2004 19:01:06 +0100

The patch looks good, thanks. You might want to put the local time into `struct http_stat' (where other details lie), so that the number of arguments to gethttp doesn't multiply.

Would you agree to post the patch to the list at <[EMAIL PROTECTED]>, so that other people can try it out?

[Attachment: wget-cvs-ifmodsince.patch, binary data]
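As a rough illustration of the conditional request the patch description mentions, the sketch below formats a local file's modification time as an 'If-Modified-Since' header line. The function name format_if_modified_since is hypothetical and is not taken from the patch; the sketch assumes only the standard HTTP date format and the C library's gmtime() and strftime().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper, not part of the patch: write an
   "If-Modified-Since" header line for the local timestamp MTIME into
   BUF.  Returns the number of characters written, or 0 on failure.  */
size_t
format_if_modified_since (time_t mtime, char *buf, size_t bufsize)
{
  static const char prefix[] = "If-Modified-Since: ";
  size_t plen = sizeof (prefix) - 1;
  struct tm *gmt;
  size_t n;

  assert (buf != NULL);

  gmt = gmtime (&mtime);
  if (!gmt || bufsize <= plen)
    return 0;

  memcpy (buf, prefix, plen);
  /* HTTP dates are expressed in GMT.  strftime() takes day and month
     names from the current locale, so this assumes the default "C"
     locale is in effect.  */
  n = strftime (buf + plen, bufsize - plen,
                "%a, %d %b %Y %H:%M:%S GMT", gmt);
  if (n == 0)
    return 0;
  return plen + n;
}
```

A server that honours the header answers with status 304 (Not Modified) when the remote file is no newer than the given time, so the body does not need to be transferred again.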
Help: No such file or directory
I am a new user of Wget on Windows. The version I use is 1.9.1, running on Windows XP.

I can download a file from a remote FTP server via Microsoft Internet Explorer with the command:

  ftp://user_name:[EMAIL PROTECTED]/full_file_path

However, I want to automate my file transfer process, and I chose Wget so that the process can be started from a script. The command I use is:

  wget ftp://user_name:[EMAIL PROTECTED]/full_file_path

The result is:

  Winsock error: 10060 failed: No such file or directory

I also tried adding the following options:

  --proxy=off --passive-ftp
  --proxy=on --passive-ftp
  --proxy=on
  --proxy=off

The result is:

  failed: No such file or directory

(the Winsock error does not appear)

Can anyone help? (I am not subscribed yet, so please CC me when you reply. Thanks a lot.)

Paul Kwok
Re: Socks proxy?
The SOCKS support was added to Wget at a very early date and was unmaintained for a long time, up to the point where it wouldn't build at all. Since I didn't have the SOCKS library installed and no one even reported the failures, I decided to remove the `--with-socks' option from configure until someone stepped up to add back the support.

In the meantime, the SOCKS library itself changed, and porting new applications to use it became much simpler than it used to be. If you have the time, visit http://www.socks.permeo.com/TechnicalResources/SOCKSFAQ/SOCKSGeneralFAQ/HowtoSocksifyClients.asp and see if the listed steps work with Wget.

As far as I can tell, Wget is SOCKS-friendly, according to the guidelines at http://www.socks.permeo.com/TechnicalResources/DevelopDocuments/SOCKSFReferenceImpl120C.asp
Socks proxy?
I'm not subscribed to the list, so please CC your replies to my mail address.

I've been using Wget with an HTTP proxy that doesn't support resuming (Proxy+), so I wanted to configure it to work with SOCKS, but it doesn't seem to have that feature.

I have a .wgetrc file in my HOME that says:

  http_proxy = http://10.0.0.3:1080/
  ftp_proxy = http://10.0.0.3:1080/

(The SOCKS proxy is listening on port 1080.)

It used to work when it was pointing to the HTTP proxy:

  http_proxy = http://10.0.0.3:4480/
  ftp_proxy = http://10.0.0.3:4480/

Now that it points to the SOCKS one, it fails to recognize the headers when connected to the proxy.

I also tried downloading the WgetPro sources and compiling those with --with-socks, but that didn't work either. As a matter of fact, if I do

  cd src; grep -i socks *

I only get:

  config.h:/* Define if you wish to compile with socks support. */
  config.h:#define HAVE_SOCKS 1
  config.h.in:/* Define if you wish to compile with socks support. */
  config.h.in:#undef HAVE_SOCKS
  Binary file ftp-opie.o matches
  Binary file wpro matches

This seems crazy to me: the code doesn't appear to use the HAVE_SOCKS variable at all, so I wonder how it could have SOCKS support that way.

I'm sure I'm doing something wrong; could anyone please tell me what? Thanks in advance.

Best regards,

-- 
H. Hernán Moraldo
Moraldo Games
http://games.moraldo.com.ar/
Re: Robots = off directive
patrick robinson <[EMAIL PROTECTED]> writes:

>> That message has nothing to do with robots.txt, it means that you
>> have rejected the file using the `-R' or equivalent option.
>
> Here you go again with this IMHO stupid implemented option.

Why thank you.

> I'm using it too but on some suffixes it acts after downloading by
> deleting the already downloaded file and on other suffixes it works
> in advance.

It works /a posteriori/ on HTML documents because they need to be downloaded to be examined for links. Otherwise something like `wget -r -A jpg URL' would not download anything because the index item is not an image.

> But I'm only using version 1.8.2 maybe it has been changed in more
> recent versions by now.

Many things have improved since 1.8.2. I recommend upgrading.
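The ordering described in this reply can be sketched as a small decision function. This is an illustration only and not Wget's actual code: the names decide_action and has_suffix, the enum values, and the plain suffix test are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcomes for one link found during a recursive run.  */
enum action { ACT_SKIP, ACT_KEEP, ACT_FETCH_THEN_DELETE };

/* Return true if NAME ends with SUFFIX.  */
static bool
has_suffix (const char *name, const char *suffix)
{
  size_t nlen = strlen (name);
  size_t slen = strlen (suffix);
  return nlen >= slen && strcmp (name + nlen - slen, suffix) == 0;
}

/* Decide what to do with NAME when only files ending in ACCEPT_SUFFIX
   are wanted.  HTML pages are fetched even when they do not match,
   because their links must be examined; they are deleted afterwards.  */
enum action
decide_action (const char *name, const char *accept_suffix)
{
  assert (name != NULL && accept_suffix != NULL);

  if (has_suffix (name, accept_suffix))
    return ACT_KEEP;
  if (has_suffix (name, ".html") || has_suffix (name, ".htm"))
    return ACT_FETCH_THEN_DELETE;
  return ACT_SKIP;              /* rejected in advance */
}
```

With an accept suffix of ".jpg", an index page falls into the fetch-then-delete case, which is the behaviour the "a posteriori" remark refers to, while a non-matching archive is rejected before any download.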
Re: Robots = off directive
Hello Hrvoje,

On 16-Feb-04, you wrote:

> "chatiman" <[EMAIL PROTECTED]> writes:
>> I'm trying to download a robots.txt protected directory and I'm having the
>> following problem:
>>
>> - wget downloads the files but deletes them after they are downloaded with
>> the following message (translated from French):
>> Destroyed because it must be rejected

> That message has nothing to do with robots.txt, it means that you have
> rejected the file using the `-R' or equivalent option.

Here you go again with this IMHO stupid implemented option. I'm using it too but on some suffixes it acts after downloading by deleting the already downloaded file and on other suffixes it works in advance.

I wonder why it does not always reject predefined files/suffixes in advance. It doesn't make much sense to download them and then delete them.

But I'm only using version 1.8.2; maybe it has been changed in more recent versions by now.

Regards
Patrick Robinson
Re: Startup delay on Windows
I'd be content with the following logic:

Don't process a `system' wgetrc. If $HOME is not defined, use the directory the Wget executable is in as $HOME (what home_dir() returns). If $HOME/.wgetrc exists, use that; otherwise look for wget.ini in the directory the executable is in, regardless of $HOME.

We would retain wget.ini support for backward compatibility, and support .wgetrc for consistency with other platforms and with the handling of .netrc. This would only break things if people had $HOME defined and it contained a .wgetrc and they expected the Windows port to ignore it.

As a side-effect, this would also resolve the above issue.

I went ahead and implemented this. I figure at least it will work as an interim solution.

2004-02-16  David Fritz  <[EMAIL PROTECTED]>

  * init.c (home_dir): Use aprintf() instead of xmalloc()/sprintf(). Under Windows, if $HOME is not defined, use the directory that contains the Wget binary instead of hard-coded `C:\'.
  (wgetrc_file_name): Under Windows, look for $HOME/.wgetrc then, if not found, look for wget.ini in the directory of the Wget binary.

  * mswindows.c (ws_mypath): Employ slightly more robust methodology. Strip trailing path separator.

Index: src/init.c
===
RCS file: /pack/anoncvs/wget/src/init.c,v
retrieving revision 1.91
diff -u -r1.91 init.c
--- src/init.c 2003/12/14 13:35:27 1.91
+++ src/init.c 2004/02/16 15:58:36
@@ -1,5 +1,5 @@
 /* Reading/parsing the initialization file.
-   Copyright (C) 1995, 1996, 1997, 1998, 2000, 2001, 2003
+   Copyright (C) 1995, 1996, 1997, 1998, 2000, 2001, 2003, 2004
    Free Software Foundation, Inc.
 
 This file is part of GNU Wget.
@@ -314,9 +314,9 @@
         return NULL;
       home = pwd->pw_dir;
 #else /* WINDOWS */
-      home = "C:\\";
-      /* Maybe I should grab home_dir from registry, but the best
-         that I could get from there is user's Start menu. It sucks! */
+      /* Under Windows, if $HOME isn't defined, use the directory where
+         `wget.exe' resides. */
+      home = ws_mypath ();
 #endif /* WINDOWS */
     }
 
@@ -347,27 +347,24 @@
       return xstrdup (env);
     }
 
-#ifndef WINDOWS
   /* If that failed, try $HOME/.wgetrc. */
   home = home_dir ();
   if (home)
-    {
-      file = (char *)xmalloc (strlen (home) + 1 + strlen (".wgetrc") + 1);
-      sprintf (file, "%s/.wgetrc", home);
-    }
+    file = aprintf ("%s/.wgetrc", home);
   xfree_null (home);
-#else /* WINDOWS */
-  /* Under Windows, "home" is (for the purposes of this function) the
-     directory where `wget.exe' resides, and `wget.ini' will be used
-     as file name. SYSTEM_WGETRC should not be defined under WINDOWS.
-
-     It is not as trivial as I assumed, because on 95 argv[0] is full
-     path, but on NT you get what you typed in command line. --dbudor */
-  home = ws_mypath ();
-  if (home)
+
+#ifdef WINDOWS
+  /* Under Windows, if we still haven't found .wgetrc, look for the file
+     `wget.ini' in the directory where `wget.exe' resides; we do this for
+     backward compatibility with previous versions of Wget.
+     SYSTEM_WGETRC should not be defined under WINDOWS. */
+  if (!file || !file_exists_p (file))
     {
-      file = (char *)xmalloc (strlen (home) + strlen ("wget.ini") + 1);
-      sprintf (file, "%swget.ini", home);
+      xfree_null (file);
+      file = NULL;
+      home = ws_mypath ();
+      if (home)
+        file = aprintf ("%s/wget.ini", home);
     }
 #endif /* WINDOWS */
 
Index: src/mswindows.c
===
RCS file: /pack/anoncvs/wget/src/mswindows.c,v
retrieving revision 1.22
diff -u -r1.22 mswindows.c
--- src/mswindows.c 2003/11/03 21:57:03 1.22
+++ src/mswindows.c 2004/02/16 15:58:37
@@ -1,5 +1,5 @@
 /* mswindows.c -- Windows-specific support
-   Copyright (C) 1995, 1996, 1997, 1998 Free Software Foundation, Inc.
+   Copyright (C) 1995, 1996, 1997, 1998, 2004 Free Software Foundation, Inc.
 
 This file is part of GNU Wget.
 
@@ -199,22 +199,25 @@
 ws_mypath (void)
 {
   static char *wspathsave = NULL;
-  char buffer[MAX_PATH];
-  char *ptr;
 
-  if (wspathsave)
+  if (!wspathsave)
     {
-      return wspathsave;
-    }
+      char buf[MAX_PATH + 1];
+      char *p;
+      DWORD len;
+
+      len = GetModuleFileName (GetModuleHandle (NULL), buf, sizeof (buf));
+      if (!len || (len >= sizeof (buf)))
+        return NULL;
+
+      p = strrchr (buf, PATH_SEPARATOR);
+      if (!p)
+        return NULL;
 
-  if (GetModuleFileName (NULL, buffer, MAX_PATH) &&
-      (ptr = strrchr (buffer, PATH_SEPARATOR)) != NULL)
-    {
-      *(ptr + 1) = '\0';
-      wspathsave = xstrdup (buffer);
+      *p = '\0';
+      wspathsave = xstrdup (buf);
     }
-  else
-    wspathsave = NULL;
+
   return wspathsave;
 }
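The lookup order proposed in this message can be restated as a small portable sketch. It is not the patch itself: pick_wgetrc and file_exists are hypothetical names, the standard library stands in for Wget's aprintf() and file_exists_p(), and the caller is assumed to supply the executable's directory (the value ws_mypath() provides on Windows).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for file_exists_p(): nonzero if PATH can be
   opened for reading.  */
static int
file_exists (const char *path)
{
  FILE *fp = fopen (path, "r");
  if (!fp)
    return 0;
  fclose (fp);
  return 1;
}

/* Sketch of the proposed order.  HOME may be NULL when $HOME is not
   set; EXE_DIR is the directory holding the executable.  Returns a
   malloc'ed path the caller must free, or NULL if allocation fails.  */
char *
pick_wgetrc (const char *home, const char *exe_dir)
{
  const char *base;
  size_t len;
  char *file;

  assert (exe_dir != NULL);

  /* If $HOME is not defined, the executable's directory acts as home.  */
  base = home ? home : exe_dir;

  len = strlen (base) + sizeof ("/.wgetrc");
  file = malloc (len);
  if (!file)
    return NULL;
  snprintf (file, len, "%s/.wgetrc", base);
  if (file_exists (file))
    return file;
  free (file);

  /* Fall back to wget.ini next to the executable, regardless of $HOME.  */
  len = strlen (exe_dir) + sizeof ("/wget.ini");
  file = malloc (len);
  if (!file)
    return NULL;
  snprintf (file, len, "%s/wget.ini", exe_dir);
  return file;
}
```

As in the patch, the wget.ini path is returned without checking that the file exists; that decision is left to whatever code later tries to read it.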
Re: Robots = off directive
"chatiman" <[EMAIL PROTECTED]> writes:

> I'm trying to download a robots.txt protected directory and I'm having the
> following problem:
>
> - wget downloads the files but deletes them after they are downloaded with
> the following message (translated from French):
> Destroyed because it must be rejected

That message has nothing to do with robots.txt; it means that you have rejected the file using the `-R' or equivalent option.
Robots = off directive
Hello,

I'm trying to download a robots.txt protected directory and I'm having the following problem:

- wget downloads the files but deletes them after they are downloaded, with the following message (translated from French):

  Destroyed because it must be rejected

How can I prevent this?

Thanks

PS: I'm using wget 1.8.1-6
RE: delete-before switch
[resubmitted to wget@ instead of wget-patches]

> From: Rupert Levene [mailto:[EMAIL PROTECTED]
> ..
> My vote: keep the option for either behaviour :-) As written, the
> patch only changes behaviour if the --timestamping and --delete-before
> options are in effect.
>
> Rupert

I understand that you want that feature for your own special needs; on the other hand, there is Hrvoje's (more than reasonable!) desire to avoid option proliferation and creeping featuritis.

So why not a more general option? You could code a run-external-command feature that runs before and after downloading a file, passing a number of arguments. Something like

  BEFORE [LOC=location, url] [SAVE_PATH=path where the file will be saved] [REF=possibly referring url] [ORG_SIZE=...] [STARTTIME=] ...

then download, followed by

  AFTER SUCCESS|FAILURE [NUM_ATTEMPTS=..] [ERRTYPE=TIMEOUT|MAX_ATTEMPTS|NOT_RESOLVED] [FINAL_SIZE=] [USERTIME=...] [EFFECTIVETIME=usertime except the retry waiting periods] ...

This is just an example of syntax and parameters; somebody could probably come up with a better syntax, and possibly some other interesting data could be gathered. Possibly the data could be passed in the environment instead of as arguments (this would avoid the need for getopts or string operations in simple shell scripts).

This would solve a whole lot of wanted features with just one option. For example, from time to time somebody wants to know how to get an exact list of downloaded files; currently the log must be parsed or something similar. For the delete-before case, you would just write a small script that unlinks the SAVE_PATH file and run wget --run-before=dounlink.pl or whatever.

I suppose that, for a start, passing just the basic data already available (url, path & filename, SUCCESS|FAILURE) would limit the amount of work needed for this.

Hrvoje, what do you think about this? Acceptable? Horrible?

Heiko

-- 
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax
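To make the environment-passing variant of this proposal more concrete, here is a minimal sketch of how an "after download" hook might be invoked on a POSIX system. Nothing here exists in Wget: run_after_hook and the WGET_HOOK_* variable names are invented for the illustration, and only the standard setenv() and system() calls are assumed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical hook runner: export the download details in the
   environment, then run COMMAND through the shell.  Returns the
   command's exit status, or -1 if it could not be run.  */
int
run_after_hook (const char *command, const char *url,
                const char *save_path, int success)
{
  int status;

  assert (command != NULL && url != NULL && save_path != NULL);

  /* The variable names are assumptions made for this sketch.  */
  if (setenv ("WGET_HOOK_URL", url, 1) != 0
      || setenv ("WGET_HOOK_SAVE_PATH", save_path, 1) != 0
      || setenv ("WGET_HOOK_RESULT", success ? "SUCCESS" : "FAILURE", 1) != 0)
    return -1;

  status = system (command);
  if (status == -1 || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

A hook script would then read the variables directly, for example appending "$WGET_HOOK_SAVE_PATH" to a list of downloaded files when "$WGET_HOOK_RESULT" is SUCCESS, without any argument parsing.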