Re: [Bug-wget] Wget manpage missing options
From: Giuseppe Scrivano

> Yes Sir, here we go! A freshly baked tarball:
>
> http://it.gnu.org/~gscrivano/files/wget-1.14.62-9d54.tar.bz2

Half-baked? It lost a few VMS-specific changes, but the first big problem was:

    while ((line = read_whole_line (fp)) != NULL)
    .^
    %CC-I-IMPLICITFUNC, In this statement, the identifier "read_whole_line" is implicitly declared as a function.
    at line number 588 in file SYS$SYSDEVICE:[UTILITY.SOURCE.WGET.wget-1_14_62-9d54.src]init.c;2

I could declare it (as with secure_getenv()) in vms/config.h_vms if I knew how, but evidence is sparse:

    ALP $ search [-...]*.h, *.c read_whole_line
    ** SYS$SYSDEVICE:[UTILITY.SOURCE.WGET.wget-1_14_62-9d54.src]init.c;2
    while ((line = read_whole_line (fp)) != NULL)
    ALP $

I assume that it should be in "lib/.c" (which would help at link time, too).

SMS.
Re: [Bug-wget] [PATCH] timeout option is ignored if host does not answer SSL handshake (openssl)
Tim Rühsen writes:

> I saw it, but took the routine from 'Mget' (it is my code, so I can
> contribute it to Wget). This was a matter of the time I had, and I knew
> that it works.
> The idea is to define 'connect_timeout' as the time during which nothing
> happens while connecting.
> But please feel free to change it to work as in wgnutls_read_timeout().

OK, I will commit your patch for now and try to find some time to change it.

> To explain my repeated indentation problems:
> The IDE I am working with (NetBeans) doesn't allow project-based
> indentation style. Since I always have several dozen projects open,
> almost all of them using 'Linux' style, I have to hand-remove the tabs
> and replace them with spaces (on each line). Pretty awful, especially
> because the 'artificial intelligence' screws up from time to time. I
> really can't write larger GNU code, just some fixes or hacks (though I
> would like to, but my poor nerves...).
>
> Does Eclipse do it any better?

I leave the answer to someone who uses Eclipse :-)

I use Emacs, and I have the opposite problem: configuring another style for
non-GNU projects.

For what it matters, you can just ignore indentation; I always run through
the patches again and check these sorts of issues before pushing them.
Sometimes I also try to deal with missing ChangeLog entries, but most of
the time it is too painful :-)

-- 
Giuseppe
Re: [Bug-wget] Wget manpage missing options
"Steven M. Schweda" writes:

> From: Giuseppe Scrivano
>
> However, after wasting some time looking for the "bool backups"
> problem, I'd rather have a complete, consistent kit to work with. If
> you could (when you get bored) create a "tar" kit, then I'd be willing
> to try that. (Thanks.)

Yes Sir, here we go! A freshly baked tarball:

http://it.gnu.org/~gscrivano/files/wget-1.14.62-9d54.tar.bz2

2783de24b5f2a1fd0ab63c5f74c60fb98e79eca3  wget-1.14.62-9d54.tar.bz2

The git history looks like:

    $ git log -4 --oneline
    9d5481e Make --backups work as documented
    af4e6fb doc: document --backups
    de5855d vms: support --backups
    9e0d87a Download response body data for all requests

It is not rebased on the current wget master branch; I will do that before
pushing the changes, if you confirm they work for you too.

Thanks!
Giuseppe
Re: [Bug-wget] [PATCH] timeout option is ignored if host does not answer SSL handshake (openssl)
Am Donnerstag, 11. Juli 2013 schrieb Giuseppe Scrivano:
> Tim Rühsen writes:
>
> > diff --git a/src/gnutls.c b/src/gnutls.c
> > index 54422fc..a3b4ecc 100644
> > --- a/src/gnutls.c
> > +++ b/src/gnutls.c
> >        do
> >          {
> >            err = gnutls_handshake (session);
> > -          if (err < 0)
> > +
> > +          if (opt.connect_timeout && err == GNUTLS_E_AGAIN)
> > +            {
> > +              if (gnutls_record_get_direction (session))
> > +                {
> > +                  /* wait for writeability */
> > +                  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_WRITE);
> > +                }
> > +              else
> > +                {
> > +                  /* wait for readability */
> > +                  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_READ);
>
> since this is in a loop, should we also decrement the time we wait for
> at each iteration? We do something similar in wgnutls_read_timeout.

I saw it, but took the routine from 'Mget' (it is my code, so I can
contribute it to Wget). This was a matter of the time I had, and I knew
that it works.
The idea is to define 'connect_timeout' as the time during which nothing
happens while connecting.
But please feel free to change it to work as in wgnutls_read_timeout().

BTW, maybe Wget should have something like curl's -m (a total/maximum
timeout). I need such a thing in several projects (where I use curl instead
of Wget for this reason).

> I have fixed some indentation problems and also I had some troubles to
> apply your patch with "git am" so I had to apply the changes
> separately. Could you please use the version I have attached?

I'll locally revert my commit and pull it in from master.

To explain my repeated indentation problems:
The IDE I am working with (NetBeans) doesn't allow project-based indentation
style. Since I always have several dozen projects open, almost all of them
using 'Linux' style, I have to hand-remove the tabs and replace them with
spaces (on each line). Pretty awful, especially because the 'artificial
intelligence' screws up from time to time. I really can't write larger GNU
code, just some fixes or hacks (though I would like to, but my poor
nerves...).

Does Eclipse do it any better?

Regards, Tim
Re: [Bug-wget] [PATCH] Add documentation for --regex-type and --preserve-permissions (and fix it)
Tomas Hozza writes:

> From d5133f6e8f19fad36d711c8b592608cac2b92c53 Mon Sep 17 00:00:00 2001
> From: Tomas Hozza
> Date: Thu, 11 Jul 2013 17:52:28 +0200
> Subject: [PATCH] Document missing options and fix --preserve-permissions

Thanks for the patch. I made some trivial changes before pushing it: I
removed trailing white space in the ChangeLog files and used two spaces at
the end of sentences.

-- 
Giuseppe
Re: [Bug-wget] [PATCH] Due to keep_alive handling login to protected site using http://username:password@server/ does not work
Tomas Hozza writes:

> +2013-03-20  Tomas Hozza  (tiny change)
> +
> +	* http.c (gethttp): Set "sock" to -1 if it's not and we have no
> +	persistent connection
> +
>  2013-04-26  Tomas Hozza  (tiny change)

The patch looks OK; I have pushed it to master. I have removed the "(tiny
change)" note in the ChangeLog, as we resolved this issue separately.

Thanks for the contribution.

-- 
Giuseppe
Re: [Bug-wget] [PATCH] timeout option is ignored if host does not answer SSL handshake (openssl)
Tim Rühsen writes:

> diff --git a/src/gnutls.c b/src/gnutls.c
> index 54422fc..a3b4ecc 100644
> --- a/src/gnutls.c
> +++ b/src/gnutls.c
>        do
>          {
>            err = gnutls_handshake (session);
> -          if (err < 0)
> +
> +          if (opt.connect_timeout && err == GNUTLS_E_AGAIN)
> +            {
> +              if (gnutls_record_get_direction (session))
> +                {
> +                  /* wait for writeability */
> +                  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_WRITE);
> +                }
> +              else
> +                {
> +                  /* wait for readability */
> +                  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_READ);

since this is in a loop, should we also decrement the time we wait for at
each iteration? We do something similar in wgnutls_read_timeout.

I have fixed some indentation problems, and I also had some trouble
applying your patch with "git am", so I had to apply the changes
separately. Could you please use the version I have attached?

From 68a0ded101f7a5cc92014012254bb6f9d31738b9 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen
Date: Thu, 11 Jul 2013 14:29:20 +0200
Subject: [PATCH] gnutls: honor connect timeout

---
 src/ChangeLog |  4 ++++
 src/gnutls.c  | 60 ++-
 2 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 5b978eb..efdc6b4 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,7 @@
+2013-07-11  Tim Ruehsen
+
+	* gnutls.c (ssl_connect_wget): respect connect timeout.
+
 2013-04-26  Tomas Hozza  (tiny change)
 
 	* log.c (redirect_output): Use DEFAULT_LOGFILE in diagnostic message

diff --git a/src/gnutls.c b/src/gnutls.c
index 54422fc..06f9020 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -374,6 +374,9 @@ static struct transport_implementation wgnutls_transport =
 bool
 ssl_connect_wget (int fd, const char *hostname)
 {
+#ifdef F_GETFL
+  int flags = 0;
+#endif
   struct wgnutls_transport_context *ctx;
   gnutls_session_t session;
   int err,alert;
@@ -441,11 +444,54 @@ ssl_connect_wget (int fd, const char *hostname)
       return false;
     }
 
+  if (opt.connect_timeout)
+    {
+#ifdef F_GETFL
+      flags = fcntl (fd, F_GETFL, 0);
+      if (flags < 0)
+        return flags;
+      if (fcntl (fd, F_SETFL, flags | O_NONBLOCK))
+        return -1;
+#else
+      /* XXX: Assume it was blocking before.  */
+      const int one = 1;
+      if (ioctl (fd, FIONBIO, &one) < 0)
+        return -1;
+#endif
+    }
+
   /* We don't stop the handshake process for non-fatal errors */
   do
     {
       err = gnutls_handshake (session);
-      if (err < 0)
+
+      if (opt.connect_timeout && err == GNUTLS_E_AGAIN)
+        {
+          if (gnutls_record_get_direction (session))
+            {
+              /* wait for writeability */
+              err = select_fd (fd, opt.connect_timeout, WAIT_FOR_WRITE);
+            }
+          else
+            {
+              /* wait for readability */
+              err = select_fd (fd, opt.connect_timeout, WAIT_FOR_READ);
+            }
+
+          if (err <= 0)
+            {
+              if (err == 0)
+                {
+                  errno = ETIMEDOUT;
+                  err = -1;
+                }
+              break;
+            }
+
+          if (err <= 0)
+            break;
+        }
+      else if (err < 0)
         {
           logprintf (LOG_NOTQUIET, "GnuTLS: %s\n", gnutls_strerror (err));
           if (err == GNUTLS_E_WARNING_ALERT_RECEIVED ||
@@ -461,6 +507,18 @@ ssl_connect_wget (int fd, const char *hostname)
     }
   while (err == GNUTLS_E_WARNING_ALERT_RECEIVED && gnutls_error_is_fatal (err) == 0);
 
+  if (opt.connect_timeout)
+    {
+#ifdef F_GETFL
+      if (fcntl (fd, F_SETFL, flags) < 0)
+        return -1;
+#else
+      const int zero = 0;
+      if (ioctl (fd, FIONBIO, &zero) < 0)
+        return -1;
+#endif
+    }
+
   if (err < 0)
     {
       gnutls_deinit (session);
-- 
1.8.3.1

-- 
Giuseppe
Re: [Bug-wget] [PATCH] timeout option is ignored if host does not answer SSL handshake (openssl)
Tomas Hozza writes:

> From b565c9fcf37fb8d71b3c338f0ec8982295e283fe Mon Sep 17 00:00:00 2001
> From: Karsten Hopp
> Date: Thu, 11 Jul 2013 11:27:35 +0200
> Subject: [PATCH] Fix timeout option when used with SSL
>
> Previously wget didn't honor the --timeout option if the remote host did
> not answer SSL handshake
>
> Signed-off-by: Tomas Hozza
> ---
>  src/openssl.c | 62 ++-
>  1 file changed, 53 insertions(+), 9 deletions(-)

Thank you for submitting the patch. It looks fine (I have just one minor
comment below). Could I please ask you to also provide the entries for the
ChangeLog file as part of your patch? It is a boring task, but it is
required by the GNU Coding Standards [1]. If you have no time for this, I
can do it.

> diff --git a/src/openssl.c b/src/openssl.c
> @@ -425,7 +461,14 @@ ssl_connect_wget (int fd, const char *hostname)
>    if (!SSL_set_fd (conn, FD_TO_SOCKET (fd)))
>      goto error;
>    SSL_set_connect_state (conn);
> -  if (SSL_connect (conn) <= 0 || conn->state != SSL_ST_OK)
> +
> +  scwt_ctx.ssl = conn;
> +  if (run_with_timeout(opt.read_timeout, ssl_connect_with_timeout_callback,

Trailing whitespace here; "git am" complained about it.

-- 
Giuseppe

[1] http://www.gnu.org/prep/standards/standards.html#Change-Logs
Re: [Bug-wget] Wget manpage missing options
From: Giuseppe Scrivano

> I think the patches should apply without conflicts on 1.14. Could you
> try to apply the patch I submitted yesterday and see if it works as
> expected on VMS? If you want I can prepare a new tarball for you.

I'd guess that, given that VMS-specific delete() change and code which
uses "_" instead of "." in (all) the constructed names, if it works on a
Unix(-like) system, then it should do about as well on VMS. But I could
test it.

However, after wasting some time looking for the "bool backups" problem,
I'd rather have a complete, consistent kit to work with. If you could
(when you get bored) create a "tar" kit, then I'd be willing to try that.
(Thanks.)

SMS.
Re: [Bug-wget] NTLM auth broken in 1.13.4
Am Mittwoch, 10. Juli 2013 schrieb Hrvoje Niksic:
> The NTLM code kindly donated by Daniel has always required OpenSSL.
> configure.ac says:
>
>   dnl Enable NTLM if requested and if SSL is available.
>   if test x"$LIBSSL" != x || test "$ac_cv_lib_ssl32_SSL_connect" = yes
>   then
>     if test x"$ENABLE_NTLM" != xno
>     then
>       AC_DEFINE([ENABLE_NTLM], 1,
>                 [Define if you want the NTLM authorization support compiled in.])
>       AC_LIBOBJ([http-ntlm])
>     fi
>   ...
>
> Updating the code to also support GnuTLS appears straightforward.

I just took a look at it. GnuTLS doesn't seem to support MD4 or DES in ECB
mode. But since GnuTLS depends on libnettle, IMO that is the way to go...

Any complaints?

Regards, Tim
[Bug-wget] [PATCH] Add documentation for --regex-type and --preserve-permissions (and fix it)
Hi.

I'm sending you a patch that adds the missing documentation for the
--regex-type and --preserve-permissions options, which are currently
documented only briefly in the wget usage output (--help).

The patch also fixes the --preserve-permissions option, since it was not
working as expected when downloading a single file via FTP (remote file
permissions were not preserved).

Regards,
Tomas Hozza

From d5133f6e8f19fad36d711c8b592608cac2b92c53 Mon Sep 17 00:00:00 2001
From: Tomas Hozza
Date: Thu, 11 Jul 2013 17:52:28 +0200
Subject: [PATCH] Document missing options and fix --preserve-permissions

Added documentation for --regex-type and --preserve-permissions options.
Fixed --preserve-permissions to work properly also if downloading a single
file from FTP.

Signed-off-by: Tomas Hozza
---
 doc/ChangeLog | 4 ++++
 doc/wget.texi | 9 +++++++++
 src/ChangeLog | 5 +++++
 src/ftp.c     | 6 +++---
 4 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/doc/ChangeLog b/doc/ChangeLog
index 1a70e3c..5e8fece 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,7 @@
+2013-07-11  Tomas Hozza
+
+	* wget.texi: Document --regex-type and --preserve-permissions
+
 2013-06-17  Dave Reisner  (tiny change)
 
 	* texi2pod.pl: Fix formatting error that causes build to fail with

diff --git a/doc/wget.texi b/doc/wget.texi
index 710f0ac..285e4b7 100644
--- a/doc/wget.texi
+++ b/doc/wget.texi
@@ -1816,6 +1816,10 @@ in some rare firewall configurations, active FTP actually works when
 passive FTP doesn't.  If you suspect this to be the case, use this
 option, or set @code{passive_ftp=off} in your init file.
 
+@cindex file permissions
+@item --preserve-permissions
+Preserve remote file permissions instead of permissions set by umask.
+
 @cindex symbolic links, retrieving
 @item --retr-symlinks
 Usually, when retrieving @sc{ftp} directories recursively and a symbolic
@@ -2057,6 +2061,11 @@ it will be treated as a pattern, rather than a suffix.
 @itemx --reject-regex @var{urlregex}
 Specify a regular expression to accept or reject the complete URL.
 
+@item --regex-type @var{regextype}
+Specify the regular expression type.  Possible types are @samp{posix} or
+@samp{pcre}.  Note that to be able to use @samp{pcre} type, wget has to be
+compiled with libpcre support.
+
 @item -D @var{domain-list}
 @itemx --domains=@var{domain-list}
 Set domains to be followed.  @var{domain-list} is a comma-separated list

diff --git a/src/ChangeLog b/src/ChangeLog
index 5b978eb..20e10c2 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,8 @@
+2013-07-11  Tomas Hozza
+
+	* ftp.c (ftp_loop): Use ftp_retrieve_glob() also in case
+	--preserve-permissions was specified.
+
 2013-04-26  Tomas Hozza  (tiny change)
 
 	* log.c (redirect_output): Use DEFAULT_LOGFILE in diagnostic message

diff --git a/src/ftp.c b/src/ftp.c
index 9b3d81c..4954258 100644
--- a/src/ftp.c
+++ b/src/ftp.c
@@ -2285,11 +2285,11 @@ ftp_loop (struct url *u, char **local_file, int *dt, struct url *proxy,
           file_part = u->path;
           ispattern = has_wildcards_p (file_part);
         }
-      if (ispattern || recursive || opt.timestamping)
+      if (ispattern || recursive || opt.timestamping || opt.preserve_perm)
         {
           /* ftp_retrieve_glob is a catch-all function that gets called
-             if we need globbing, time-stamping or recursion.  Its
-             third argument is just what we really need.  */
+             if we need globbing, time-stamping, recursion or preserve
+             permissions.  Its third argument is just what we really need.  */
           res = ftp_retrieve_glob (u, &con,
                                    ispattern ? GLOB_GLOBALL : GLOB_GETONE);
         }
-- 
1.8.3.1
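As a usage sketch of the newly documented option (host, path, and pattern below are placeholders, not from the patch):

```shell
# Recursively fetch only URLs ending in .tar.gz, matching the complete
# URL with a PCRE pattern.  Requires wget built with libpcre support.
wget -r --no-parent \
     --regex-type pcre \
     --accept-regex '.*\.tar\.gz$' \
     http://example.com/pub/
```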
[Bug-wget] A possible wget bug?
Hi,

I'm downloading and would like to exclude several paths from the download.
I put the command in a script (enclosed below) but it keeps doing the
download. Am I messing something up? Am I not understanding the syntax?

My intent is to NOT download any of the i386, epel, macports, postgresql,
or ubuntu directories. I tried just coding
"i386,epel,macports,postgresql,ubuntu" but that didn't work either.

Any help you can give me in this would be wonderful.

Regards, George...

wget -r -l999 --no-parent mirror.pnl.gov -X "mirror.pnl.gov/fedora/linux/development/19/i386,mirror.pnl.gov/fedora/linux/development/rawhide/i386,mirror.pnl.gov/fedora/linux/releases/15/Everything/i386,mirror.pnl.gov/fedora/linux/releases/15/Fedora/i386,mirror.pnl.gov/fedora/linux/releases/16/Everything/i386,mirror.pnl.gov/fedora/linux/releases/16/Fedora/i386,mirror.pnl.gov/fedora/linux/releases/17/Everything/i386,mirror.pnl.gov/fedora/linux/releases/17/Fedora/i386,mirror.pnl.gov/fedora/linux/releases/18/Everything/i386,mirror.pnl.gov/fedora/linux/releases/18/Fedora/i386,mirror.pnl.gov/fedora/linux/releases/18/Live/i386,mirror.pnl.gov/fedora/linux/updates/15/i386,mirror.pnl.gov/fedora/linux/updates/16/i386,mirror.pnl.gov/fedora/linux/updates/17/i386,mirror.pnl.gov/fedora/linux/updates/18/i386,mirror.pnl.gov/fedora/linux/updates/19/i386,mirror.pnl.gov/epel,mirror.pnl.gov/macports,mirror.pnl.gov/postgresql,mirror.pnl.gov/releases,mirror.pnl.gov/ubuntu"
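The likely problem with the command above: -X/--exclude-directories takes server-relative directory paths (starting with "/", no hostname prefix), and list elements may contain wildcards. A hedged rewrite along those lines (untested against this particular mirror):

```shell
# -X expects comma-separated directory paths relative to the server
# root, not "host/path" strings; "*/i386" uses the wildcard support to
# cover every i386 subdirectory at once.
wget -r -l999 --no-parent \
     -X "*/i386,/epel,/macports,/postgresql,/releases,/ubuntu" \
     http://mirror.pnl.gov/
```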
Re: [Bug-wget] Wget manpage missing options
Hi Giuseppe.

----- Original Message -----
> thanks for having tested it; indeed it was working in a different way. I
> don't think this feature had any user before, considering how broken it
> is.
>
> This patch should make it conformant to the documentation; if nobody
> complains, I will push this series to master.
>
> From 9d5481e1f1f3f70f21e7f529f214da390bf6188a Mon Sep 17 00:00:00 2001
> From: Giuseppe Scrivano
> Date: Wed, 10 Jul 2013 20:59:34 +0200
> Subject: [PATCH] Make --backups work as documented

With the patch, the --backups option works as expected; however, it does
not apply cleanly on wget-1.14. But this is not a problem IMHO (it can be
easily tweaked to apply).

Thank you for documenting and fixing the option!

Regards,
Tomas Hozza
Re: [Bug-wget] [PATCH] timeout option is ignored if host does not answer SSL handshake (openssl)
Am Donnerstag, 11. Juli 2013 schrieb Tomas Hozza:
> Calling wget on https server with --timeout option does not work
> when the server does not answer SSL handshake. Note that this has
> been tested on wget-1.14 compiled with OpenSSL.

Hi,

here is the corresponding patch for GnuTLS.

Regards, Tim

From 5862c2e0e84838f40eda6332650bab10274bb211 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen
Date: Thu, 11 Jul 2013 14:29:20 +0200
Subject: [PATCH] add connect timeout to gnutls code

---
 src/ChangeLog |  6 ++
 src/gnutls.c  | 63 +--
 2 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 5b978eb..c39cfcb 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,9 @@
+2013-07-11  Tim Ruehsen
+
+	* gnutls.c (ssl_connect_wget): respect connect timeout
+
 2013-04-26  Tomas Hozza  (tiny change)
 
 	* log.c (redirect_output): Use DEFAULT_LOGFILE in diagnostic message

diff --git a/src/gnutls.c b/src/gnutls.c
index 54422fc..a3b4ecc 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -374,6 +374,9 @@ static struct transport_implementation wgnutls_transport =
 bool
 ssl_connect_wget (int fd, const char *hostname)
 {
+#ifdef F_GETFL
+  int flags = 0;
+#endif
   struct wgnutls_transport_context *ctx;
   gnutls_session_t session;
   int err,alert;
@@ -441,11 +444,55 @@ ssl_connect_wget (int fd, const char *hostname)
       return false;
     }
 
+  if (opt.connect_timeout)
+    {
+#ifdef F_GETFL
+      flags = fcntl (fd, F_GETFL, 0);
+      if (flags < 0)
+        return flags;
+      if (fcntl (fd, F_SETFL, flags | O_NONBLOCK))
+        return -1;
+#else
+      /* XXX: Assume it was blocking before.  */
+      const int one = 1;
+      if (ioctl (fd, FIONBIO, &one) < 0)
+        return -1;
+#endif
+    }
+
   /* We don't stop the handshake process for non-fatal errors */
   do
     {
      err = gnutls_handshake (session);
-      if (err < 0)
+
+      if (opt.connect_timeout && err == GNUTLS_E_AGAIN)
+        {
+          if (gnutls_record_get_direction (session))
+            {
+              /* wait for writeability */
+              err = select_fd (fd, opt.connect_timeout, WAIT_FOR_WRITE);
+            }
+          else
+            {
+              /* wait for readability */
+              err = select_fd (fd, opt.connect_timeout, WAIT_FOR_READ);
+            }
+
+          if (err <= 0)
+            {
+              if (err == 0)
+                {
+                  errno = ETIMEDOUT;
+                  err = -1;
+                }
+
+              break;
+            }
+
+          if (err <= 0)
+            break;
+        }
+      else if (err < 0)
        {
          logprintf (LOG_NOTQUIET, "GnuTLS: %s\n", gnutls_strerror (err));
          if (err == GNUTLS_E_WARNING_ALERT_RECEIVED ||
@@ -461,6 +508,18 @@ ssl_connect_wget (int fd, const char *hostname)
     }
   while (err == GNUTLS_E_WARNING_ALERT_RECEIVED && gnutls_error_is_fatal (err) == 0);
 
+  if (opt.connect_timeout)
+    {
+#ifdef F_GETFL
+      if (fcntl (fd, F_SETFL, flags) < 0)
+        return -1;
+#else
+      const int zero = 0;
+      if (ioctl (fd, FIONBIO, &zero) < 0)
+        return -1;
+#endif
+    }
+
   if (err < 0)
     {
       gnutls_deinit (session);
@@ -468,7 +527,7 @@ ssl_connect_wget (int fd, const char *hostname)
     }
 
   ctx = xnew0 (struct wgnutls_transport_context);
-  ctx->session = session;
+  ctx->session = session;
   fd_register_transport (fd, &wgnutls_transport, ctx);
   return true;
 }
-- 
1.8.3.2
[Bug-wget] [PATCH] Due to keep_alive handling login to protected site using http://username:password@server/ does not work
Hello.

I already reported this bug in Savannah (https://savannah.gnu.org/bugs/?38554) and will copy the discussion we already had there to answer some questions.

----- BEGINNING

Wed 20 Mar 2013 11:52:09 AM GMT, original submission: (Tomas Hozza)

This problem exists, for example, when trying to download something from a Boa web server with Basic authentication. Problem description from Red Hat Bugzilla [1] (on Fedora 18 with wget-1.13.4 AND wget-1.14):

The problem is that the boa server answers the initial GET packet with two 401 Unauthorised packets (at least in my case). From your dumps I see that boa is answering with only one packet, but it is somehow broken. Although wget seems to interpret it correctly, it closes the connection without setting the opened-socket variable to -1. It then fails trying to write to that already closed socket. This repeats for the default maximum number of retries.

The reason wget-1.12 works is how it handles connections. In wget-1.12 the connection established for the GET request is closed before authentication and the socket variable is set to -1. So when wget tries to authenticate to the server, it establishes a new connection and everything works.

The reason the --user and --password options work is that those credentials are used as defaults for site-wide authentication. What wget does in this case is pretty much the same as when you pass credentials in the URL, with one difference. When wget receives 401 Unauthorised from the server, has those "default" credentials set, AND has NO credentials in the URL, it adds the host address to a list of hosts which request authentication. After that it also closes the initial connection, then fails to write the new GET request with authentication credentials. BUT when it tries for the second time, it finds the host address in the list of hosts which request authentication and therefore sends the GET request including authentication credentials. So that's why it works.

I think the proper fix for this would be to set the socket variable to -1 in case of a problem when authenticating and closing the socket. This will make sure that wget will not try to write data to an already closed socket, but rather will create a new connection.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=912358

---

Wed 20 Mar 2013 11:56:06 AM GMT, comment #1: (Tomas Hozza)

I think bug https://savannah.gnu.org/bugs/?34141 may be connected to this one. Sorry for the new bug, but there was no activity whatsoever on 34141, so hopefully now it will be better.

---

Thu 04 Apr 2013 05:08:30 PM GMT, comment #2: (Giuseppe Scrivano)

Wouldn't that leak the socket if it is not valid? Should you perhaps use CLOSE_INVALIDATE? Can you also add the entry to the src/ChangeLog file?

---

Fri 05 Apr 2013 07:42:13 AM GMT, comment #3: (Tomas Hozza)

Thank you for your response.

> wouldn't that leak the socket if it is not valid?

I scanned wget-1.14, without the patch and with the patch, using the static analysis tool Coverity 6.5.1. It didn't find any added errors, so there should be NO resource leak. I know static analysis is not perfect, but from my experience with Coverity I must say it is very reliable.

Anyway, there is the following construction:

    if (persistent_available_p (...))
      {
        ...
      }
    else if (host_lookup_failed)
      {
        ...
      }
    else if (sock != -1)
      {
        sock = -1;
      }

persistent_available_p(...) returns false only if there is NO active connection, OR if we want to use a new connection (SSL), OR if there was some problem with the connection and it was invalidated. In other words, if persistent_available_p(...) didn't return true, then we have to create a new connection anyway.

> Should you perhaps use CLOSE_INVALIDATE?

The problem is that persistent_available_p(...) deals with "pconn.socket", while outside of this function we use the "sock" variable. Even if we used CLOSE_INVALIDATE() in persistent_available_p(), we would have to set "sock" to -1, because we might be using an already closed socket. Also, calling CLOSE_INVALIDATE() on "sock" if persistent_available_p() returned false would not be a good idea, because we might be closing an already closed socket.

> Can you also add the entry to the src/ChangeLog file?

I attached a new patch with an entry in src/ChangeLog.

END -----

So if anyone could have a look at the patch, review it and possibly merge it into git, it would be great. If you have any questions regarding the patch, please let me know.

Regards,
Tomas Hozza

From 7e1c434415711ec11ab05f50fd6a898d16eb133e Mon Sep 17 00:00:00 2001
From: Tomas Hozza
Date: Thu, 11 Jul 2013 13:22:43 +0200
Subject: [PATCH] Set sock variable to -1 if no persistent conn exists

Wget should set sock variable to -1 if no persistent connection exists.
Function persistent_available_p() tests persistent connect
[Bug-wget] [PATCH] timeout option is ignored if host does not answer SSL handshake (openssl)
Hi.

Calling wget on an https server with the --timeout option does not work
when the server does not answer the SSL handshake. Note that this has been
tested on wget-1.14 compiled with OpenSSL.

This issue can be reproduced as follows:

- on the first terminal run:

    # nc -l localhost 12345

- on the second terminal run:

    wget --timeout=2 --no-check-certificate https://localhost:12345

Without the attached patch, wget does not exit after the specified timeout.

Regards,
Tomas Hozza

From b565c9fcf37fb8d71b3c338f0ec8982295e283fe Mon Sep 17 00:00:00 2001
From: Karsten Hopp
Date: Thu, 11 Jul 2013 11:27:35 +0200
Subject: [PATCH] Fix timeout option when used with SSL

Previously wget didn't honor the --timeout option if the remote host did
not answer SSL handshake

Signed-off-by: Tomas Hozza
---
 src/openssl.c | 62 ++-
 1 file changed, 53 insertions(+), 9 deletions(-)

diff --git a/src/openssl.c b/src/openssl.c
index 3924e41..189b334 100644
--- a/src/openssl.c
+++ b/src/openssl.c
@@ -256,19 +256,42 @@ struct openssl_transport_context {
   char *last_error;		/* last error printed with openssl_errstr */
 };
 
-static int
-openssl_read (int fd, char *buf, int bufsize, void *arg)
-{
-  int ret;
-  struct openssl_transport_context *ctx = arg;
+struct openssl_read_args {
+  int fd;
+  struct openssl_transport_context *ctx;
+  char *buf;
+  int bufsize;
+  int retval;
+};
+
+static void openssl_read_callback(void *arg) {
+  struct openssl_read_args *args = (struct openssl_read_args *) arg;
+  struct openssl_transport_context *ctx = args->ctx;
   SSL *conn = ctx->conn;
+  char *buf = args->buf;
+  int bufsize = args->bufsize;
+  int ret;
+
   do
     ret = SSL_read (conn, buf, bufsize);
-  while (ret == -1
-         && SSL_get_error (conn, ret) == SSL_ERROR_SYSCALL
+  while (ret == -1 && SSL_get_error (conn, ret) == SSL_ERROR_SYSCALL
          && errno == EINTR);
+  args->retval = ret;
+}
 
-  return ret;
+static int
+openssl_read (int fd, char *buf, int bufsize, void *arg)
+{
+  struct openssl_read_args args;
+  args.fd = fd;
+  args.buf = buf;
+  args.bufsize = bufsize;
+  args.ctx = (struct openssl_transport_context*) arg;
+
+  if (run_with_timeout(opt.read_timeout, openssl_read_callback, &args)) {
+    return -1;
+  }
+  return args.retval;
 }
 
 static int
@@ -386,6 +409,18 @@ static struct transport_implementation openssl_transport = {
   openssl_peek, openssl_errstr, openssl_close
 };
 
+struct scwt_context {
+  SSL *ssl;
+  int result;
+};
+
+static void
+ssl_connect_with_timeout_callback(void *arg)
+{
+  struct scwt_context *ctx = (struct scwt_context *)arg;
+  ctx->result = SSL_connect(ctx->ssl);
+}
+
 /* Perform the SSL handshake on file descriptor FD, which is assumed
    to be connected to an SSL server.  The SSL handle provided by
    OpenSSL is registered with the file descriptor FD using
@@ -398,6 +433,7 @@ bool
 ssl_connect_wget (int fd, const char *hostname)
 {
   SSL *conn;
+  struct scwt_context scwt_ctx;
   struct openssl_transport_context *ctx;
 
   DEBUGP (("Initiating SSL handshake.\n"));
@@ -425,7 +461,14 @@ ssl_connect_wget (int fd, const char *hostname)
   if (!SSL_set_fd (conn, FD_TO_SOCKET (fd)))
     goto error;
   SSL_set_connect_state (conn);
-  if (SSL_connect (conn) <= 0 || conn->state != SSL_ST_OK)
+
+  scwt_ctx.ssl = conn;
+  if (run_with_timeout(opt.read_timeout, ssl_connect_with_timeout_callback,
+                       &scwt_ctx)) {
+    DEBUGP (("SSL handshake timed out.\n"));
+    goto timeout;
+  }
+  if (scwt_ctx.result <= 0 || conn->state != SSL_ST_OK)
    goto error;
 
   ctx = xnew0 (struct openssl_transport_context);
@@ -441,6 +484,7 @@ ssl_connect_wget (int fd, const char *hostname)
  error:
   DEBUGP (("SSL handshake failed.\n"));
   print_errors ();
+ timeout:
   if (conn)
     SSL_free (conn);
   return false;
-- 
1.8.3.1
Re: [Bug-wget] Listing fails with specific ftp-Server
Am 09.07.2013 23:51, schrieb Giuseppe Scrivano:
> Simon Winiger writes:
>> Hello, I have the problem that wget cannot download from a specific
>> ftp server, because it can't get a listing (debug code below (user+pw
>> X-ed); the produced html file contains no info). I searched the web and
>> this list, but found only some info from 2001 and wget 1.6 (?). Maybe
>> someone can give a hint?
>
> Thanks for the report, can you run wget with --no-remove-listing and
> attach the .listing file?

Hello Giuseppe,

thanks for your support. I did this; the .listing file contains only one
line:

    "z:\-a not found"

(The ftp server works with Windows ftp clients like Cyberduck or WinSCP.)

I wrote to the admin of the ftp server; he answered that the ftp server
used can't cope with regular expressions and, of course, supplies no
index.htm(l). He has now set up a mini http server which works with wget.
(So the problem is solved for me personally.)

-- 
Simon Winiger, M.Sc.
Division Thermal Systems and Buildings
Fraunhofer Institut für Solare Energiesysteme ISE
Heidenhofstrasse 2, 79110 Freiburg, Germany
Tel: +49 (0)761/ 4588-5129
Email: simon.wini...@ise.fraunhofer.de
http://www.ise.fraunhofer.de
Re: [Bug-wget] Wget manpage missing options
"Steven M. Schweda" writes:

>    I also assumed that the patch was based on something newer than the
> original 1.14 source kit I was using, so, even if I had noticed a
> problem, I probably would not have been suspicious. For example, around
> here, nothing worked right until I made this change:

I think the patches should apply without conflicts on 1.14. Could you try
to apply the patch I submitted yesterday and see if it works as expected
on VMS? If you want, I can prepare a new tarball for you.

-- 
Giuseppe