Re: [Bug-wget] Wget manpage missing options

2013-07-11 Thread Steven M. Schweda
From: Giuseppe Scrivano 

> Yes Sir, here we go!  A freshly baked tarball:
> 
> http://it.gnu.org/~gscrivano/files/wget-1.14.62-9d54.tar.bz2

   Half-baked?  Lost a few VMS-specific changes, but the first big
problem was:

  while ((line = read_whole_line (fp)) != NULL)
.^
%CC-I-IMPLICITFUNC, In this statement, the identifier "read_whole_line" is implicitly declared as a function.
at line number 588 in file SYS$SYSDEVICE:[UTILITY.SOURCE.WGET.wget-1_14_62-9d54.src]init.c;2

   I could declare it (as with secure_getenv()) in vms/config.h_vms if I
knew how, but evidence is sparse:

ALP $ search [-...]*.h, *.c read_whole_line

**
SYS$SYSDEVICE:[UTILITY.SOURCE.WGET.wget-1_14_62-9d54.src]init.c;2

  while ((line = read_whole_line (fp)) != NULL)
ALP $ 

   I assume that it should be in "lib/.c" (which would help
at link time, too).

   SMS.



Re: [Bug-wget] [PATCH] timeout option is ingnored if host does not answer SSL handshake (openssl)

2013-07-11 Thread Giuseppe Scrivano
Tim Rühsen  writes:

> I saw it, but took the routine from 'Mget' (it is my code, so I can
> contribute it to Wget).  It was a matter of the time I had, and I knew
> that it works.
> The idea is to define 'connect_timeout' as the time during which
> nothing happens while connecting.
> But please feel free to change it to work as in wgnutls_read_timeout().

OK, I will commit your patch for now and try to find some time to change it.


> To explain my repeated indentation problems:
> The IDE I am working with (NetBeans) doesn't allow a project-based
> indentation style.  Since I always have several dozen projects open,
> almost all of them using 'Linux' style, I have to remove the tabs by hand
> and replace them with spaces (in each line).  Pretty awful, especially
> because the 'artificial intelligence' gets in the way from time to time.
> I really can't write larger GNU code, just some fixes or hacks (though I
> would like to, but my poor nerves...).
>
> Does Eclipse do it any better?

I leave the answer to someone who uses Eclipse :-)  I use Emacs and I
have the opposite problem: configuring another style for non-GNU projects.

For what it's worth, you can just ignore indentation.  I always go through
the patches again and check for this sort of issue before pushing them.

Sometimes I try to deal also with missing ChangeLog entries, but most of
the time it is too painful :-)

-- 
Giuseppe



Re: [Bug-wget] Wget manpage missing options

2013-07-11 Thread Giuseppe Scrivano
"Steven M. Schweda"  writes:

> From: Giuseppe Scrivano 

>However, after wasting some time looking for the "bool backups"
> problem, I'd rather have a complete, consistent kit to work with.  If
> you could (when you get bored) create a "tar" kit, then I'd be willing
> to try that.  (Thanks.)

Yes Sir, here we go!  A freshly baked tarball:

http://it.gnu.org/~gscrivano/files/wget-1.14.62-9d54.tar.bz2

2783de24b5f2a1fd0ab63c5f74c60fb98e79eca3  wget-1.14.62-9d54.tar.bz2

The git history looks like:

$ git log -4 --oneline
9d5481e Make --backups work as documented
af4e6fb doc: document --backups
de5855d vms: support --backups
9e0d87a Download response body data for all requests

It is not rebased on the current wget master branch.  I will do that
before pushing the changes, if you confirm they work for you too.

Thanks!
Giuseppe



Re: [Bug-wget] [PATCH] timeout option is ingnored if host does not answer SSL handshake (openssl)

2013-07-11 Thread Tim Rühsen
On Thursday, 11 July 2013, Giuseppe Scrivano wrote:
> Tim Rühsen  writes:
> 
> > diff --git a/src/gnutls.c b/src/gnutls.c
> > index 54422fc..a3b4ecc 100644
> > --- a/src/gnutls.c
> > +++ b/src/gnutls.c
> >do
> >  {
> >err = gnutls_handshake (session);
> > -  if (err < 0)
> > +
> > +  if (opt.connect_timeout && err == GNUTLS_E_AGAIN)
> > +{
> > +  if (gnutls_record_get_direction (session))
> > +{
> > +  /* wait for writeability */
> > +  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_WRITE);
> > +}
> > +  else
> > +{
> > +  /* wait for readability */
> > +  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_READ);
> 
> since this is in a loop, should we also decrement the time we wait for
> at each iteration?  We do something similar in wgnutls_read_timeout.

I saw it, but took the routine from 'Mget' (it is my code, so I can
contribute it to Wget).  It was a matter of the time I had, and I knew
that it works.
The idea is to define 'connect_timeout' as the time during which nothing
happens while connecting.
But please feel free to change it to work as in wgnutls_read_timeout().

BTW, maybe Wget should have something like curl's -m (a total/maximum
timeout).  I need such a thing in several projects (where I use curl
instead of Wget for this reason).
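
A total timeout of the kind curl offers with -m can coexist with per-operation timeouts by capping each wait at whatever is left of the overall allowance.  The following is only a rough sketch of that idea, not code from Wget or curl: the names time_limit, time_limit_start and next_timeout are made up for illustration, and times are plain seconds supplied by the caller.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for a total time limit.  */
struct time_limit
{
  bool enabled;      /* false means no total limit was requested */
  double deadline;   /* absolute time (seconds) at which to give up */
};

/* Start a total limit of MAX_TIME seconds at time NOW.  A MAX_TIME of
   zero or less disables the limit.  */
static struct time_limit
time_limit_start (double max_time, double now)
{
  struct time_limit tl;
  tl.enabled = max_time > 0;
  tl.deadline = now + max_time;
  return tl;
}

/* Return the timeout to use for the next operation at time NOW: the
   smaller of PER_OP and the time left before the deadline.  A PER_OP
   of zero or less means "no per-operation timeout".  Returns -1 when
   the total limit has already expired.  */
static double
next_timeout (const struct time_limit *tl, double per_op, double now)
{
  double remaining;

  if (!tl->enabled)
    return per_op;

  remaining = tl->deadline - now;
  if (remaining <= 0)
    return -1;
  if (per_op <= 0 || remaining < per_op)
    return remaining;
  return per_op;
}
```

With a 10 second total limit started at t=100 and a 3 second per-operation timeout, a wait starting at t=108 would be capped at 2 seconds, and at t=110 the caller would be told to give up.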


> I have fixed some indentation problems and I also had some trouble
> applying your patch with "git am", so I had to apply the changes
> separately.  Could you please use the version I have attached?

I will revert my commit locally and pull it in from master.

To explain my repeated indentation problems:
The IDE I am working with (NetBeans) doesn't allow a project-based
indentation style.  Since I always have several dozen projects open,
almost all of them using 'Linux' style, I have to remove the tabs by hand
and replace them with spaces (in each line).  Pretty awful, especially
because the 'artificial intelligence' gets in the way from time to time.
I really can't write larger GNU code, just some fixes or hacks (though I
would like to, but my poor nerves...).

Does Eclipse do it any better?

Regards, Tim




Re: [Bug-wget] [PATCH] Add documentation for --regex-type and --preserve-permissions (and fix it)

2013-07-11 Thread Giuseppe Scrivano
Tomas Hozza  writes:

> From d5133f6e8f19fad36d711c8b592608cac2b92c53 Mon Sep 17 00:00:00 2001
> From: Tomas Hozza 
> Date: Thu, 11 Jul 2013 17:52:28 +0200
> Subject: [PATCH] Document missing options and fix --preserve-permissions

Thanks for the patch.

I have made some trivial changes before pushing it: I have removed
trailing white space in the ChangeLog files and used two spaces at the
end of sentences.

-- 
Giuseppe



Re: [Bug-wget] [PATCH] Due to keep_alive handling login to protected site using http://username:password@server/ does not work

2013-07-11 Thread Giuseppe Scrivano
Tomas Hozza  writes:

> +2013-03-20  Tomas Hozza   (tiny change)
> +
> + * http.c (gethttp): Set "sock" to -1 if it's not and we have no
> + persistent connection
> +
>  2013-04-26  Tomas Hozza   (tiny change)

the patch looks OK, I have pushed it to master.

I have removed the "(tiny change)" note in the ChangeLog as we resolved
this issue separately.

Thanks for the contribution.

-- 
Giuseppe



Re: [Bug-wget] [PATCH] timeout option is ingnored if host does not answer SSL handshake (openssl)

2013-07-11 Thread Giuseppe Scrivano
Tim Rühsen  writes:

> diff --git a/src/gnutls.c b/src/gnutls.c
> index 54422fc..a3b4ecc 100644
> --- a/src/gnutls.c
> +++ b/src/gnutls.c
>do
>  {
>err = gnutls_handshake (session);
> -  if (err < 0)
> +
> +  if (opt.connect_timeout && err == GNUTLS_E_AGAIN)
> +{
> +  if (gnutls_record_get_direction (session))
> +{
> +  /* wait for writeability */
> +  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_WRITE);
> +}
> +  else
> +{
> +  /* wait for readability */
> +  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_READ);

since this is in a loop, should we also decrement the time we wait for
at each iteration?  We do something similar in wgnutls_read_timeout.
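
A minimal sketch of what decrementing the wait across iterations could look like, assuming a select()-style wait that returns 1 when ready, 0 on timeout and -1 on error.  The names wait_fn, charge_budget, wait_with_budget and fake_wait are hypothetical and do not come from wget; select_fd from the patch would need a thin wrapper to fit this shape.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical stand-in for a select()-style wait: returns 1 when the
   descriptor is ready, 0 on timeout, -1 on error.  */
typedef int (*wait_fn) (int fd, double max_wait, void *state);

/* Monotonic clock reading in seconds.  */
static double
now_seconds (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Subtract SPENT from BUDGET without going below zero.  */
static double
charge_budget (double budget, double spent)
{
  return spent < budget ? budget - spent : 0;
}

/* Wait for at most *BUDGET seconds, then charge the time actually
   spent against *BUDGET so that later loop iterations wait for less.
   Returns what WAIT returned, or 0 (timeout) if nothing is left.  */
static int
wait_with_budget (int fd, double *budget, wait_fn wait, void *state)
{
  double start;
  int ret;

  if (*budget <= 0)
    return 0;

  start = now_seconds ();
  ret = wait (fd, *budget, state);
  *budget = charge_budget (*budget, now_seconds () - start);
  return ret;
}

/* Demo wait function that reports "ready" immediately.  */
static int
fake_wait (int fd, double max_wait, void *state)
{
  (void) fd;
  (void) max_wait;
  (void) state;
  return 1;
}
```

In a handshake loop the budget would be initialised once from the configured timeout, for example `double budget = opt.connect_timeout;`, and each pass would call wait_with_budget instead of waiting for the full timeout again.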

I have fixed some indentation problems and I also had some trouble
applying your patch with "git am", so I had to apply the changes
separately.  Could you please use the version I have attached?

>From 68a0ded101f7a5cc92014012254bb6f9d31738b9 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen 
Date: Thu, 11 Jul 2013 14:29:20 +0200
Subject: [PATCH] gnutls: honor connect timeout

---
 src/ChangeLog |  4 
 src/gnutls.c  | 60 ++-
 2 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 5b978eb..efdc6b4 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,7 @@
+2013-07-11  Tim Ruehsen  
+
+* gnutls.c (ssl_connect_wget): respect connect timeout.
+
 2013-04-26  Tomas Hozza   (tiny change)
 
 	* log.c (redirect_output): Use DEFAULT_LOGFILE in diagnostic message
diff --git a/src/gnutls.c b/src/gnutls.c
index 54422fc..06f9020 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -374,6 +374,9 @@ static struct transport_implementation wgnutls_transport =
 bool
 ssl_connect_wget (int fd, const char *hostname)
 {
+#ifdef F_GETFL
+  int flags = 0;
+#endif
   struct wgnutls_transport_context *ctx;
   gnutls_session_t session;
   int err,alert;
@@ -441,11 +444,54 @@ ssl_connect_wget (int fd, const char *hostname)
   return false;
 }
 
+  if (opt.connect_timeout)
+{
+#ifdef F_GETFL
+  flags = fcntl (fd, F_GETFL, 0);
+  if (flags < 0)
+return flags;
+  if (fcntl (fd, F_SETFL, flags | O_NONBLOCK))
+return -1;
+#else
+  /* XXX: Assume it was blocking before.  */
+  const int one = 1;
+  if (ioctl (fd, FIONBIO, &one) < 0)
+return -1;
+#endif
+}
+
   /* We don't stop the handshake process for non-fatal errors */
   do
 {
   err = gnutls_handshake (session);
-  if (err < 0)
+
+  if (opt.connect_timeout && err == GNUTLS_E_AGAIN)
+{
+  if (gnutls_record_get_direction (session))
+{
+  /* wait for writeability */
+  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_WRITE);
+}
+  else
+{
+  /* wait for readability */
+  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_READ);
+}
+
+  if (err <= 0)
+{
+  if (err == 0)
+{
+  errno = ETIMEDOUT;
+  err = -1;
+}
+  break;
+}
+
+  if (err <= 0)
+break;
+}
+  else if (err < 0)
 {
   logprintf (LOG_NOTQUIET, "GnuTLS: %s\n", gnutls_strerror (err));
   if (err == GNUTLS_E_WARNING_ALERT_RECEIVED ||
@@ -461,6 +507,18 @@ ssl_connect_wget (int fd, const char *hostname)
 }
   while (err == GNUTLS_E_WARNING_ALERT_RECEIVED && gnutls_error_is_fatal (err) == 0);
 
+  if (opt.connect_timeout)
+{
+#ifdef F_GETFL
+  if (fcntl (fd, F_SETFL, flags) < 0)
+return -1;
+#else
+  const int zero = 0;
+  if (ioctl (fd, FIONBIO, &zero) < 0)
+return -1;
+#endif
+}
+
   if (err < 0)
 {
   gnutls_deinit (session);
-- 
1.8.3.1


-- 
Giuseppe


Re: [Bug-wget] [PATCH] timeout option is ingnored if host does not answer SSL handshake (openssl)

2013-07-11 Thread Giuseppe Scrivano
Tomas Hozza  writes:

> From b565c9fcf37fb8d71b3c338f0ec8982295e283fe Mon Sep 17 00:00:00 2001
> From: Karsten Hopp 
> Date: Thu, 11 Jul 2013 11:27:35 +0200
> Subject: [PATCH] Fix timeout option when used with SSL
>
> Previously wget didn't honor the --timeout option if the remote host did
> not answer SSL handshake
>
> Signed-off-by: Tomas Hozza 
> ---
>  src/openssl.c | 62 
> ++-
>  1 file changed, 53 insertions(+), 9 deletions(-)

Thanks for submitting the patch.  It looks fine (I have just a minor
comment below).  Could I please ask you to also provide the entries for
the ChangeLog file as part of your patch?  It is a boring task but it
is required by the GNU Coding Standards[1].  If you have no time for
this, I can do it.

> diff --git a/src/openssl.c b/src/openssl.c
> @@ -425,7 +461,14 @@ ssl_connect_wget (int fd, const char *hostname)
>if (!SSL_set_fd (conn, FD_TO_SOCKET (fd)))
>  goto error;
>SSL_set_connect_state (conn);
> -  if (SSL_connect (conn) <= 0 || conn->state != SSL_ST_OK)
> +
> +  scwt_ctx.ssl = conn;
> +  if (run_with_timeout(opt.read_timeout, ssl_connect_with_timeout_callback, 

trailing whitespace.  git am complained about it.

-- 
Giuseppe

1) http://www.gnu.org/prep/standards/standards.html#Change-Logs



Re: [Bug-wget] Wget manpage missing options

2013-07-11 Thread Steven M. Schweda
From: Giuseppe Scrivano 

> I think the patches should apply without conflicts on 1.14.  Could you
> try to apply the patch I submitted yesterday and see if it works as
> expected on VMS?  If you want I can prepare a new tarball for you.

   I'd guess that, given that VMS-specific delete() change and code
which uses "_" instead of "." in (all) the constructed names, if it
works on a Unix(-like) system, then it should do about as well on VMS. 
But I could test it.

   However, after wasting some time looking for the "bool backups"
problem, I'd rather have a complete, consistent kit to work with.  If
you could (when you get bored) create a "tar" kit, then I'd be willing
to try that.  (Thanks.)

   SMS.



Re: [Bug-wget] NTLM auth broken in 1.13.4

2013-07-11 Thread Tim Rühsen
On Wednesday, 10 July 2013, Hrvoje Niksic wrote:
> The NTLM code kindly donated by Daniel has always required OpenSSL.
> configure.ac says:
> 
> dnl Enable NTLM if requested and if SSL is available.
> if test x"$LIBSSL" != x || test "$ac_cv_lib_ssl32_SSL_connect" = yes
> then
>   if test x"$ENABLE_NTLM" != xno
>   then
> AC_DEFINE([ENABLE_NTLM], 1,
>  [Define if you want the NTLM authorization support compiled in.])
> AC_LIBOBJ([http-ntlm])
>   fi
> ...
> 
> Updating the code to also support GNU/TLS appears straightforward.

I just took a look at it.
GnuTLS doesn't seem to support either MD4 or DES in ECB mode.

But since GnuTLS depends on libnettle, IMO that is the way to go...

Any objections?

Regards, Tim




[Bug-wget] [PATCH] Add documentation for --regex-type and --preserve-permissions (and fix it)

2013-07-11 Thread Tomas Hozza
Hi.

I'm sending you a patch that adds the missing documentation for the
--regex-type and --preserve-permissions options, which are currently only
briefly documented in the wget usage output (--help).

The patch also fixes --preserve-permissions option, since it was not
working as expected when downloading a single file via FTP (remote file
permissions were not preserved).


Regards,

Tomas Hozza

From d5133f6e8f19fad36d711c8b592608cac2b92c53 Mon Sep 17 00:00:00 2001
From: Tomas Hozza 
Date: Thu, 11 Jul 2013 17:52:28 +0200
Subject: [PATCH] Document missing options and fix --preserve-permissions

Added documentation for --regex-type and --preserve-permissions
options.

Fixed --preserve-permissions to work properly also if downloading a
single file from FTP.

Signed-off-by: Tomas Hozza 
---
 doc/ChangeLog | 4 
 doc/wget.texi | 9 +
 src/ChangeLog | 5 +
 src/ftp.c | 6 +++---
 4 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/doc/ChangeLog b/doc/ChangeLog
index 1a70e3c..5e8fece 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,7 @@
+2013-07-11  Tomas Hozza  
+
+	* wget.texi: Document --regex-type and --preserve-permissions 
+
 2013-06-17  Dave Reisner   (tiny change)
 
 	* texi2pod.pl: Fix formatting error that causes build to fail with
diff --git a/doc/wget.texi b/doc/wget.texi
index 710f0ac..285e4b7 100644
--- a/doc/wget.texi
+++ b/doc/wget.texi
@@ -1816,6 +1816,10 @@ in some rare firewall configurations, active FTP actually works when
 passive FTP doesn't.  If you suspect this to be the case, use this
 option, or set @code{passive_ftp=off} in your init file.
 
+@cindex file permissions
+@item --preserve-permissions
+Preserve remote file permissions instead of permissions set by umask.
+
 @cindex symbolic links, retrieving
 @item --retr-symlinks
 Usually, when retrieving @sc{ftp} directories recursively and a symbolic
@@ -2057,6 +2061,11 @@ it will be treated as a pattern, rather than a suffix.
 @itemx --reject-regex @var{urlregex}
 Specify a regular expression to accept or reject the complete URL.
 
+@item --regex-type @var{regextype}
+Specify the regular expression type. Possible types are @samp{posix} or
+@samp{pcre}. Note that to be able to use @samp{pcre} type, wget has to be
+compiled with libpcre support.
+
 @item -D @var{domain-list}
 @itemx --domains=@var{domain-list}
 Set domains to be followed.  @var{domain-list} is a comma-separated list
diff --git a/src/ChangeLog b/src/ChangeLog
index 5b978eb..20e10c2 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,8 @@
+2013-07-11  Tomas Hozza  
+
+	* ftp.c (ftp_loop): Use ftp_retrieve_glob() also in case
+	--preserve-permissions was specified. 
+
 2013-04-26  Tomas Hozza   (tiny change)
 
 	* log.c (redirect_output): Use DEFAULT_LOGFILE in diagnostic message
diff --git a/src/ftp.c b/src/ftp.c
index 9b3d81c..4954258 100644
--- a/src/ftp.c
+++ b/src/ftp.c
@@ -2285,11 +2285,11 @@ ftp_loop (struct url *u, char **local_file, int *dt, struct url *proxy,
 file_part = u->path;
   ispattern = has_wildcards_p (file_part);
 }
-  if (ispattern || recursive || opt.timestamping)
+  if (ispattern || recursive || opt.timestamping || opt.preserve_perm)
 {
   /* ftp_retrieve_glob is a catch-all function that gets called
- if we need globbing, time-stamping or recursion.  Its
- third argument is just what we really need.  */
+ if we need globbing, time-stamping, recursion or preserve
+ permissions. Its third argument is just what we really need.  */
   res = ftp_retrieve_glob (u, &con,
ispattern ? GLOB_GLOBALL : GLOB_GETONE);
 }
-- 
1.8.3.1



[Bug-wget] A possible wget bug?

2013-07-11 Thread George R Goffe
Hi,

I'm downloading and would like to exclude several paths from the download. I 
put the command in a script (enclosed below) but it keeps doing the download. 
Am I messing something up? Am I not understanding the syntax?

My intent is to NOT download any of the i386, epel, macports, postgresql, or 
the ubuntu directories. I tried just coding 
i386,epel,macports,postgresql,ubuntu but that didn't work either. 

Any help you can give me in this would be wonderful.

Regards,

George...


wget -r -l999 --no-parent mirror.pnl.gov -X 
"mirror.pnl.gov/fedora/linux/development/19/i386,mirror.pnl.gov/fedora/linux/development/rawhide/i386,mirror.pnl.gov/fedora/linux/releases/15/Everything/i386,mirror.pnl.gov/fedora/linux/releases/15/Fedora/i386,mirror.pnl.gov/fedora/linux/releases/16/Everything/i386,mirror.pnl.gov/fedora/linux/releases/16/Fedora/i386,mirror.pnl.gov/fedora/linux/releases/17/Everything/i386,mirror.pnl.gov/fedora/linux/releases/17/Fedora/i386,mirror.pnl.gov/fedora/linux/releases/18/Everything/i386,mirror.pnl.gov/fedora/linux/releases/18/Fedora/i386,mirror.pnl.gov/fedora/linux/releases/18/Live/i386,mirror.pnl.gov/fedora/linux/updates/15/i386,mirror.pnl.gov/fedora/linux/updates/16/i386,mirror.pnl.gov/fedora/linux/updates/17/i386,mirror.pnl.gov/fedora/linux/updates/18/i386,mirror.pnl.gov/fedora/linux/updates/19/i386,mirror.pnl.gov/epel,mirror.pnl.gov/macports,mirror.pnl.gov/postgresql,mirror.pnl.gov/releases,mirror.pnl.gov/ubuntu"


Re: [Bug-wget] Wget manpage missing options

2013-07-11 Thread Tomas Hozza
Hi Giuseppe.

----- Original Message -----
> thanks for testing it, indeed it was working in a different way.  I
> don't think this feature had any users before, considering how broken it
> is.
> 
> This patch should make it conform to the documentation; if nobody
> complains, I will push this series to master.
> 
> From 9d5481e1f1f3f70f21e7f529f214da390bf6188a Mon Sep 17 00:00:00 2001
> From: Giuseppe Scrivano 
> Date: Wed, 10 Jul 2013 20:59:34 +0200
> Subject: [PATCH] Make --backups work as documented

With the patch, the --backups option works as expected, however it does
not apply cleanly on wget-1.14. But this is not a problem IMHO (it can
be easily tweaked to apply).

Thank you for documenting and fixing the option!

Regards,

Tomas Hozza



Re: [Bug-wget] [PATCH] timeout option is ingnored if host does not answer SSL handshake (openssl)

2013-07-11 Thread Tim Rühsen
On Thursday, 11 July 2013, Tomas Hozza wrote:
> Calling wget on an HTTPS server with the --timeout option does not work
> when the server does not answer the SSL handshake. Note that this has
> been tested on wget-1.14 compiled with OpenSSL.

Hi,

here is the corresponding patch for GnuTLS.

Regards, Tim
From 5862c2e0e84838f40eda6332650bab10274bb211 Mon Sep 17 00:00:00 2001
From: Tim Ruehsen 
Date: Thu, 11 Jul 2013 14:29:20 +0200
Subject: [PATCH] add connect timeout to gnutls code

---
 src/ChangeLog |  6 ++
 src/gnutls.c  | 63 +--
 2 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 5b978eb..c39cfcb 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,9 @@
+2013-07-11  Tim Ruehsen  
+
+* gnutls.c (ssl_connect_wget): respect connect timeout
+
 2013-04-26  Tomas Hozza   (tiny change)
 
 	* log.c (redirect_output): Use DEFAULT_LOGFILE in diagnostic message
diff --git a/src/gnutls.c b/src/gnutls.c
index 54422fc..a3b4ecc 100644
--- a/src/gnutls.c
+++ b/src/gnutls.c
@@ -374,6 +374,9 @@ static struct transport_implementation wgnutls_transport =
 bool
 ssl_connect_wget (int fd, const char *hostname)
 {
+#ifdef F_GETFL
+  int flags = 0;
+#endif
   struct wgnutls_transport_context *ctx;
   gnutls_session_t session;
   int err,alert;
@@ -441,11 +444,55 @@ ssl_connect_wget (int fd, const char *hostname)
   return false;
 }
 
+  if (opt.connect_timeout)
+{
+#ifdef F_GETFL
+  flags = fcntl (fd, F_GETFL, 0);
+  if (flags < 0)
+return flags;
+  if (fcntl (fd, F_SETFL, flags | O_NONBLOCK))
+return -1;
+#else
+  /* XXX: Assume it was blocking before.  */
+  const int one = 1;
+  if (ioctl (fd, FIONBIO, &one) < 0)
+return -1;
+#endif
+}
+
   /* We don't stop the handshake process for non-fatal errors */
   do
 {
   err = gnutls_handshake (session);
-  if (err < 0)
+
+  if (opt.connect_timeout && err == GNUTLS_E_AGAIN)
+{
+  if (gnutls_record_get_direction (session))
+{
+  /* wait for writeability */
+  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_WRITE);
+}
+  else
+{
+  /* wait for readability */
+  err = select_fd (fd, opt.connect_timeout, WAIT_FOR_READ);
+}
+
+  if (err <= 0)
+{
+  if (err == 0)
+{
+  errno = ETIMEDOUT;
+		err = -1;
+}
+
+  break;
+}
+
+			 if (err <= 0)
+ break;
+}
+  else if (err < 0)
 {
   logprintf (LOG_NOTQUIET, "GnuTLS: %s\n", gnutls_strerror (err));
   if (err == GNUTLS_E_WARNING_ALERT_RECEIVED ||
@@ -461,6 +508,18 @@ ssl_connect_wget (int fd, const char *hostname)
 }
   while (err == GNUTLS_E_WARNING_ALERT_RECEIVED && gnutls_error_is_fatal (err) == 0);
 
+  if (opt.connect_timeout)
+{
+#ifdef F_GETFL
+  if (fcntl (fd, F_SETFL, flags) < 0)
+return -1;
+#else
+  const int zero = 0;
+  if (ioctl (fd, FIONBIO, &zero) < 0)
+return -1;
+#endif
+}
+
   if (err < 0)
 {
   gnutls_deinit (session);
@@ -468,7 +527,7 @@ ssl_connect_wget (int fd, const char *hostname)
 }
 
   ctx = xnew0 (struct wgnutls_transport_context);
-  ctx->session = session;
+	  ctx->session = session;
   fd_register_transport (fd, &wgnutls_transport, ctx);
   return true;
 }
-- 
1.8.3.2
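
The patch above switches the socket to non-blocking mode for the duration of the handshake and restores the original flags afterwards.  The fcntl() half of that pattern can be sketched in isolation as follows; the helper names set_nonblocking and restore_flags are hypothetical and are not part of wget, and the ioctl (FIONBIO) fallback from the patch is left out.

```c
#include <fcntl.h>

/* Put FD into non-blocking mode and remember its previous file status
   flags in *SAVED_FLAGS.  Returns 0 on success, -1 on failure.  */
static int
set_nonblocking (int fd, int *saved_flags)
{
  int flags = fcntl (fd, F_GETFL, 0);
  if (flags < 0)
    return -1;
  *saved_flags = flags;
  return fcntl (fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Restore the file status flags saved by set_nonblocking.  Returns 0
   on success, -1 on failure.  */
static int
restore_flags (int fd, int saved_flags)
{
  return fcntl (fd, F_SETFL, saved_flags) < 0 ? -1 : 0;
}
```

Saving the complete flag word, rather than just clearing O_NONBLOCK afterwards, means a descriptor that was already non-blocking before the call stays that way.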





[Bug-wget] [PATCH] Due to keep_alive handling login to protected site using http://username:password@server/ does not work

2013-07-11 Thread Tomas Hozza
Hello.

I already reported this bug on Savannah (https://savannah.gnu.org/bugs/?38554)
and will copy the discussion we already had there to answer some questions.

--- BEGINNING ---

Wed 20 Mar 2013 11:52:09 AM GMT, original submission: (Tomas Hozza)

This problem occurs, for example, when trying to download something
from a Boa web server with Basic authentication.

Problem description from Red Hat Bugzilla [1] (on Fedora 18 with wget-1.13.4 
AND wget-1.14):

The problem is that the Boa server answers the initial GET packet with two 401
Unauthorized packets (at least in my case). From your dumps I see that Boa is
answering with only one packet, but it is somehow broken. Although wget seems
to interpret it correctly, it closes the connection without setting the open
socket variable to -1. Then it fails trying to write to that already closed
socket. This repeats up to the default maximum number of retries.

The reason wget-1.12 works is how it handles connections. In wget-1.12 the
connection established for the GET request is closed before authentication and
the socket variable is set to -1. So when wget tries to authenticate to the
server, it establishes a new connection and everything works.

The reason the --user and --password options work is that those credentials
are used as defaults for site-wide authentication. What wget does in this case
is pretty much the same as when you pass credentials in the URL, with one
difference.
When wget receives 401 Unauthorized from the server and has those "default"
credentials set AND has NO credentials in the URL, it adds the host address
to a list of hosts which request authentication. After that it also closes the
initial connection, then fails to write the new GET request with authentication
credentials. BUT when it tries for the second time, it finds that the
host address is in the list of hosts which request authentication and therefore
sends a GET request including authentication credentials. So that's why
it works.

I think the proper fix for this would be to set the socket variable to -1
in case of a problem when authenticating and closing the socket. This will
make sure that wget does not try to write data to an already closed socket
but rather creates a new connection.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=912358

---

Wed 20 Mar 2013 11:56:06 AM GMT, comment #1: (Tomas Hozza)

I think bug https://savannah.gnu.org/bugs/?34141 may be connected
to this one. Sorry for the new bug, BUT there was no activity
whatsoever in 34141, so hopefully it will go better now.

---

Thu 04 Apr 2013 05:08:30 PM GMT, comment #2: (Giuseppe Scrivano)

wouldn't that leak the socket if it is not valid? Should you perhaps use 
CLOSE_INVALIDATE?

Can you also add the entry to the src/ChangeLog file?

---

Fri 05 Apr 2013 07:42:13 AM GMT, comment #3: (Tomas Hozza)

Thank you for your response.

> wouldn't that leak the socket if it is not valid?


I scanned wget-1.14 without the patch and with the patch
using the static analysis tool Coverity 6.5.1. It didn't find
any added errors, so there should be NO resource leak. I know
static analysis is not perfect, but in my experience
Coverity is very reliable.

Anyway there is the following construction:

if (persistent_available_p(...))
{
...
}
else if (host_lookup_failed)
{
...
}
else if (sock != -1)
{
sock = -1;
}

persistent_available_p(...) returns false only if there is NO active
connection, OR if we want to use a new connection (SSL), OR if there
was some problem with the connection and it was invalidated. In other
words, if persistent_available_p(...) did not return true, then we
have to create a new connection anyway.

> Should you perhaps use CLOSE_INVALIDATE?


The problem is that persistent_available_p(...) deals with "pconn.socket"
while outside of this function we use the "sock" variable. Even if we used
CLOSE_INVALIDATE() in persistent_available_p(), we would still have to set
"sock" to -1, because we might be using an already closed socket. Also, calling
CLOSE_INVALIDATE() on "sock" if persistent_available_p() returned
false would not be a good idea, because we might be closing an already
closed socket.
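
The underlying idiom, closing a descriptor at most once and marking the variable invalid so nothing writes to it later, can be sketched like this.  The function name close_and_invalidate is hypothetical; it is a simplified illustration and not wget's CLOSE_INVALIDATE macro, which also has to account for the persistent connection.

```c
#include <unistd.h>

/* Close *FD if it still refers to an open descriptor and set it to -1,
   so that a later call neither closes it twice nor uses a stale
   descriptor.  Returns the result of close(), or 0 if *FD was already
   invalid.  */
static int
close_and_invalidate (int *fd)
{
  int ret = 0;

  if (*fd >= 0)
    {
      ret = close (*fd);
      *fd = -1;
    }
  return ret;
}
```

Calling it a second time on the same variable is then harmless, which is the property the discussion above is concerned with.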

> Can you also add the entry to the src/ChangeLog file?


I attached new patch with entry in src/ChangeLog.

--- END ---

So if anyone could have a look at the patch, review it and possibly
merge it into git, that would be great. If you have any questions
regarding the patch, please let me know.

Regards,

Tomas Hozza

From 7e1c434415711ec11ab05f50fd6a898d16eb133e Mon Sep 17 00:00:00 2001
From: Tomas Hozza 
Date: Thu, 11 Jul 2013 13:22:43 +0200
Subject: [PATCH] Set sock variable to -1 if no persistent conn exists

Wget should set sock variable to -1 if no persistent
connection exists. Function persistent_available_p()
tests persistent connect

[Bug-wget] [PATCH] timeout option is ingnored if host does not answer SSL handshake (openssl)

2013-07-11 Thread Tomas Hozza
Hi.

Calling wget on an HTTPS server with the --timeout option does not work
when the server does not answer the SSL handshake. Note that this has
been tested on wget-1.14 compiled with OpenSSL.

This issue can be reproduced as follows:

- on first terminal run:
# nc -l localhost 12345

- on second terminal run:
wget --timeout=2 --no-check-certificate https://localhost:12345

Without the attached patch wget does not exit after the specified
timeout.

Regards,

Tomas Hozza

From b565c9fcf37fb8d71b3c338f0ec8982295e283fe Mon Sep 17 00:00:00 2001
From: Karsten Hopp 
Date: Thu, 11 Jul 2013 11:27:35 +0200
Subject: [PATCH] Fix timeout option when used with SSL

Previously wget didn't honor the --timeout option if the remote host did
not answer SSL handshake

Signed-off-by: Tomas Hozza 
---
 src/openssl.c | 62 ++-
 1 file changed, 53 insertions(+), 9 deletions(-)

diff --git a/src/openssl.c b/src/openssl.c
index 3924e41..189b334 100644
--- a/src/openssl.c
+++ b/src/openssl.c
@@ -256,19 +256,42 @@ struct openssl_transport_context {
   char *last_error; /* last error printed with openssl_errstr */
 };
 
-static int
-openssl_read (int fd, char *buf, int bufsize, void *arg)
-{
-  int ret;
-  struct openssl_transport_context *ctx = arg;
+struct openssl_read_args {
+  int fd;
+  struct openssl_transport_context *ctx;
+  char *buf;
+  int bufsize;
+  int retval;
+};
+
+static void openssl_read_callback(void *arg) {
+  struct openssl_read_args *args = (struct openssl_read_args *) arg;
+  struct openssl_transport_context *ctx = args->ctx;
   SSL *conn = ctx->conn;
+  char *buf = args->buf;
+  int bufsize = args->bufsize;
+  int ret;
+
   do
 ret = SSL_read (conn, buf, bufsize);
-  while (ret == -1
- && SSL_get_error (conn, ret) == SSL_ERROR_SYSCALL
+  while (ret == -1 && SSL_get_error (conn, ret) == SSL_ERROR_SYSCALL
  && errno == EINTR);
+  args->retval = ret;
+}
 
-  return ret;
+static int
+openssl_read (int fd, char *buf, int bufsize, void *arg)
+{
+  struct openssl_read_args args;
+  args.fd = fd;
+  args.buf = buf;
+  args.bufsize = bufsize;
+  args.ctx = (struct openssl_transport_context*) arg;
+
+  if (run_with_timeout(opt.read_timeout, openssl_read_callback, &args)) {
+return -1;
+  }
+  return args.retval;
 }
 
 static int
@@ -386,6 +409,18 @@ static struct transport_implementation openssl_transport = {
   openssl_peek, openssl_errstr, openssl_close
 };
 
+struct scwt_context {
+  SSL *ssl;
+  int result;
+};
+
+static void
+ssl_connect_with_timeout_callback(void *arg)
+{
+  struct scwt_context *ctx = (struct scwt_context *)arg;
+  ctx->result = SSL_connect(ctx->ssl);
+}
+
 /* Perform the SSL handshake on file descriptor FD, which is assumed
to be connected to an SSL server.  The SSL handle provided by
OpenSSL is registered with the file descriptor FD using
@@ -398,6 +433,7 @@ bool
 ssl_connect_wget (int fd, const char *hostname)
 {
   SSL *conn;
+  struct scwt_context scwt_ctx;
   struct openssl_transport_context *ctx;
 
   DEBUGP (("Initiating SSL handshake.\n"));
@@ -425,7 +461,14 @@ ssl_connect_wget (int fd, const char *hostname)
   if (!SSL_set_fd (conn, FD_TO_SOCKET (fd)))
 goto error;
   SSL_set_connect_state (conn);
-  if (SSL_connect (conn) <= 0 || conn->state != SSL_ST_OK)
+
+  scwt_ctx.ssl = conn;
+  if (run_with_timeout(opt.read_timeout, ssl_connect_with_timeout_callback, 
+   &scwt_ctx)) {
+DEBUGP (("SSL handshake timed out.\n"));
+goto timeout;
+  }
+  if (scwt_ctx.result <= 0 || conn->state != SSL_ST_OK)
 goto error;
 
   ctx = xnew0 (struct openssl_transport_context);
@@ -441,6 +484,7 @@ ssl_connect_wget (int fd, const char *hostname)
  error:
   DEBUGP (("SSL handshake failed.\n"));
   print_errors ();
+ timeout:
   if (conn)
 SSL_free (conn);
   return false;
-- 
1.8.3.1



Re: [Bug-wget] Listing fails with specific ftp-Server

2013-07-11 Thread Simon Winiger


On 09.07.2013 at 23:51, Giuseppe Scrivano wrote:

> Simon Winiger  writes:
>
>> Hello,
>>
>> I have the problem that wget cannot download from a specific
>> FTP server, because it can't get a listing (debug code below (user+pw
>> X-ed), produced html file contains no info). I searched the web and this
>> list, but found only some info from 2001 and wget 1.6 (?)
>>
>> Maybe someone can give a hint?
>
> Thanks for the report, can you run wget with --no-remove-listing and
> attach the .listing file?


Hello Giuseppe,

thanks for your support. I did this; the .listing file contains only one
line: "z:\-a not found"

(The FTP server works with Windows FTP clients like Cyberduck or WinSCP.)

I wrote to the admin of the FTP server, and he answered that the FTP server
in use can't cope with regular expressions and, of course, supplies no
index.htm(l). He has now set up a mini HTTP server which works with wget.
(So the problem is solved for me personally.)


--
Simon Winiger, M.Sc.
Division Thermal Systems and Buildings
Fraunhofer Institut für Solare Energiesysteme ISE
Heidenhofstrasse 2, 79110 Freiburg, Germany
Tel: +49 (0)761/ 4588-5129
Email: simon.wini...@ise.fraunhofer.de
http://www.ise.fraunhofer.de




Re: [Bug-wget] Wget manpage missing options

2013-07-11 Thread Giuseppe Scrivano
"Steven M. Schweda"  writes:

>I also assumed that the patch was based on something newer than the
> original 1.14 source kit I was using, so, even if I had noticed a
> problem, I probably would not have been suspicious.  For example, around
> here, nothing worked right until I made this change:

I think the patches should apply without conflicts on 1.14.  Could you
try to apply the patch I submitted yesterday and see if it works as
expected on VMS?  If you want I can prepare a new tarball for you.

-- 
Giuseppe