Re: [Bug-wget] Regression in git master branch (commit 8e6de1fb5ff0ca0c749da7db634a1b1e3a1215a2)

2014-10-20 Thread Gabriel Somlo
On Tue, Oct 21, 2014 at 01:49:41AM +0200, ?ngel Gonz?lez wrote:
> On 21/10/14 01:20, Gabriel Somlo wrote:
> >I think I found a regression in the development branch of wget, which
> >wasn't present in 1.15. Using "git bisect", it appears the offending
> >commit was 8e6de1fb5ff0ca0c749da7db634a1b1e3a1215a2 ("Drop usage of
> >strncpy") from Jun. 9 2014.
> >
> >To reproduce, grab a "wide and shallow" copy of wikipedia.org, like so:
> >
> >
> >   wget -rpkEHN -e robots=off --random-wait -t 2 -U mozilla -l 1 \
> >-P ./vservers wikipedia.org
> >
> >
> >then (some 20 minutes or so later) open ./vservers/wikipedia.org/index.html
> >in your (firefox-32.0.2-1.fc20.x86_64) browser as a file.
> >
> Thanks for your report!
> 
> After a quick look, get_uri_string seems to not be taking into account the
> end of the url() parameter (it was before 8e6de1fb5).
> Can you check if this patch fixes the issue?

Yeah, that took care of it for me !

Thanks again,
--Gabriel

> diff --git a/src/css-url.c b/src/css-url.c
> index c605798..34a20af 100644
> --- a/src/css-url.c
> +++ b/src/css-url.c
> @@ -72,6 +72,8 @@ extern int yylex (void);
>  static char *
>  get_uri_string (const char *at, int *pos, int *length)
>  {
> +  char *uri;
> +
>if (0 != strncasecmp (at + *pos, "url(", 4))
>  return NULL;
> 
> @@ -97,7 +99,14 @@ get_uri_string (const char *at, int *pos, int *length)
>*length -= 2;
>  }
> 
> -  return xstrdup (at + *pos);
> +  uri = xmalloc (*length + 1);
> +  if (uri)
> +{
> +  strncpy (uri, at + *pos, *length);
> +  uri[*length] = '\0';
> +}
> +
> +  return uri;
>  }
> 
>  void
> 
> 



Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-20 Thread Yousong Zhou
Hi, Pär

On 17 October 2014 03:50, Pär Karlsson  wrote:
> Hi, I fould a potential gotcha when playing with clang's code analysis tool.
>
> The concat_strings function silently stopped counting string lengths when
> given more than 5 arguments. clang warned about potential garbage values in
> the saved_lengths array, so I redid it with this approach.

After taking a closer look, I guess the old implementation is fine.
saved_length[] is used as a buffer for lengths of the first 5
arguments and there is a bound check with its length.  Maybe it's a
false-positive from clang tool?

Sorry for the noise...

Regards.

yousong

>
> All tests working ok with this patch.
>
> This is my first patch to this list, by the way. I'd be happy to help out
> more in the future.
>
> Best regards,
>
> /Pär Karlsson, Sweden
>
> 
>
> commit 2d855670e0e1fbe578506b376cdd40b0e465d3ef
> Author: Pär Karlsson 
> Date:   Thu Oct 16 21:41:36 2014 +0200
>
> Updated ChangeLog
>
> diff --git a/src/ChangeLog b/src/ChangeLog
> index 1c4e2d5..1e39475 100644
> --- a/src/ChangeLog
> +++ b/src/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-10-16  Pär Karlsson  
> +
> +   * utils.c (concat_strings): fixed arbitrary limit of 5 arguments to
> +   function
> +
>  2014-05-03  Tim Ruehsen  
>
> * retr.c (retrieve_url): fixed memory leak
>
> commit 1fa9ff274dcb6e5a2dbbbc7d3fe2f139059c47f1
> Author: Pär Karlsson 
> Date:   Wed Oct 15 00:00:31 2014 +0200
>
> Generalized concat_strings argument length
>
>   The concat_strings function seemed arbitrary to only accept a maximum
>   of 5 arguments (the others were silently ignored).
>
>   Also it had a potential garbage read of the values in the array.
>   Updated with xmalloc/xrealloc/free
>
> diff --git a/src/utils.c b/src/utils.c
> index 78c282e..93c9ddc 100644
> --- a/src/utils.c
> +++ b/src/utils.c
> @@ -356,7 +356,8 @@ char *
>  concat_strings (const char *str0, ...)
>  {
>va_list args;
> -  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
> +  size_t psize = sizeof(int);
> +  int *saved_lengths = xmalloc (psize);
>char *ret, *p;
>
>const char *next_str;
> @@ -370,8 +371,8 @@ concat_strings (const char *str0, ...)
>for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
>  {
>int len = strlen (next_str);
> -  if (argcount < countof (saved_lengths))
> -saved_lengths[argcount++] = len;
> +  saved_lengths[argcount++] = len;
> +  xrealloc(saved_lengths, psize * argcount);
>total_length += len;
>  }
>va_end (args);
> @@ -393,7 +394,7 @@ concat_strings (const char *str0, ...)
>  }
>va_end (args);
>*p = '\0';
> -
> +  free(saved_lengths);
>return ret;
>  }
>  ^L



Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-20 Thread Yousong Zhou
On 21 October 2014 10:02, Yousong Zhou  wrote:
> Hi, Pär.  I got a few comments inline.
>
> On 21 October 2014 05:47, Pär Karlsson  wrote:
>> Whoops, I realised I failed on the GNU coding standards, please disregard
>> the last one; the patch below should be better.
>>
>> My apologies :-/
>>
>> /Pär
>>
>> diff --git a/src/ChangeLog b/src/ChangeLog
>> index d5aeca0..87abd85 100644
>> --- a/src/ChangeLog
>> +++ b/src/ChangeLog
>> @@ -1,3 +1,8 @@
>> +2014-10-20 Pär Karlsson  
>> +
>> +   * utils.c (concat_strings): got rid of double loop, cleaned up
>> potential
>> +   memory corruption if concat_strings was called with more than five
>> args
>> +
>>  2014-10-16  Tim Ruehsen  
>>
>> * url.c (url_parse): little code cleanup
>> diff --git a/src/utils.c b/src/utils.c
>> index 78c282e..5f359e0 100644
>> --- a/src/utils.c
>> +++ b/src/utils.c
>> @@ -356,42 +356,36 @@ char *
>>  concat_strings (const char *str0, ...)
>>  {
>>va_list args;
>> -  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
>>char *ret, *p;
>>
>>const char *next_str;
>> -  int total_length = 0;
>> -  size_t argcount;
>> +  size_t len;
>> +  size_t total_length = 0;
>> +  size_t charsize = sizeof (char);
>
> I am not sure here.  Do we always assume sizeof(char) to be 1 for
> platforms supported by wget?
>
>> +  size_t chunksize = 64;
>> +  size_t bufsize = 64;
>> +
>> +  p = ret = xmalloc (charsize * bufsize);
>>
>>/* Calculate the length of and allocate the resulting string. */
>>
>> -  argcount = 0;
>>va_start (args, str0);
>>for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
>>  {
>> -  int len = strlen (next_str);
>> -  if (argcount < countof (saved_lengths))
>> -saved_lengths[argcount++] = len;
>> +  len = strlen (next_str);
>> +  if (len == 0)
>> +continue;
>>total_length += len;
>> -}
>> -  va_end (args);
>> -  p = ret = xmalloc (total_length + 1);
>> -
>> -  /* Copy the strings into the allocated space. */
>> -
>> -  argcount = 0;
>> -  va_start (args, str0);
>> -  for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
>> -{
>> -  int len;
>> -  if (argcount < countof (saved_lengths))
>> -len = saved_lengths[argcount++];
>> -  else
>> -len = strlen (next_str);
>> +  if (total_length > bufsize)
>> +  {
>> +bufsize += chunksize;
>
> Should be `bufsize = total_length` ?
>
>> +ret = xrealloc (ret, charsize * bufsize);
>> +  }
>>memcpy (p, next_str, len);
>
> Xrealloc may return a new block different from p, so memcpy(p, ...)
> may not be what you want.
>
>>p += len;
>>  }
>>va_end (args);
>> +  ret = xrealloc (ret, charsize * total_length + 1);
>>*p = '\0';
>
> Malloc takes time.  How about counting total_length in one loop and
> doing the copy in another?

I mean, we can skip the strlen part and just do strcpy in the second
loop as we already know we have enough space in the dest buffer for
all those null-terminated arguments.

 yousong



Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-20 Thread Yousong Zhou
Hi, Pär.  I got a few comments inline.

On 21 October 2014 05:47, Pär Karlsson  wrote:
> Whoops, I realised I failed on the GNU coding standards, please disregard
> the last one; the patch below should be better.
>
> My apologies :-/
>
> /Pär
>
> diff --git a/src/ChangeLog b/src/ChangeLog
> index d5aeca0..87abd85 100644
> --- a/src/ChangeLog
> +++ b/src/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-10-20 Pär Karlsson  
> +
> +   * utils.c (concat_strings): got rid of double loop, cleaned up
> potential
> +   memory corruption if concat_strings was called with more than five
> args
> +
>  2014-10-16  Tim Ruehsen  
>
> * url.c (url_parse): little code cleanup
> diff --git a/src/utils.c b/src/utils.c
> index 78c282e..5f359e0 100644
> --- a/src/utils.c
> +++ b/src/utils.c
> @@ -356,42 +356,36 @@ char *
>  concat_strings (const char *str0, ...)
>  {
>va_list args;
> -  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
>char *ret, *p;
>
>const char *next_str;
> -  int total_length = 0;
> -  size_t argcount;
> +  size_t len;
> +  size_t total_length = 0;
> +  size_t charsize = sizeof (char);

I am not sure here.  Do we always assume sizeof(char) to be 1 for
platforms supported by wget?

> +  size_t chunksize = 64;
> +  size_t bufsize = 64;
> +
> +  p = ret = xmalloc (charsize * bufsize);
>
>/* Calculate the length of and allocate the resulting string. */
>
> -  argcount = 0;
>va_start (args, str0);
>for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
>  {
> -  int len = strlen (next_str);
> -  if (argcount < countof (saved_lengths))
> -saved_lengths[argcount++] = len;
> +  len = strlen (next_str);
> +  if (len == 0)
> +continue;
>total_length += len;
> -}
> -  va_end (args);
> -  p = ret = xmalloc (total_length + 1);
> -
> -  /* Copy the strings into the allocated space. */
> -
> -  argcount = 0;
> -  va_start (args, str0);
> -  for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
> -{
> -  int len;
> -  if (argcount < countof (saved_lengths))
> -len = saved_lengths[argcount++];
> -  else
> -len = strlen (next_str);
> +  if (total_length > bufsize)
> +  {
> +bufsize += chunksize;

Should be `bufsize = total_length` ?

> +ret = xrealloc (ret, charsize * bufsize);
> +  }
>memcpy (p, next_str, len);

Xrealloc may return a new block different from p, so memcpy(p, ...)
may not be what you want.

>p += len;
>  }
>va_end (args);
> +  ret = xrealloc (ret, charsize * total_length + 1);
>*p = '\0';

Malloc takes time.  How about counting total_length in one loop and
doing the copy in another?

Regards.

yousong

>
>return ret;
>



Re: [Bug-wget] Send Content-Length with POST 0 length body

2014-10-20 Thread Darshit Shah
I think we should handle PUT and POST alone. And add additional logic to
handle a 411 response to any request.

Wget has traditionally supported POST requests and PUT has a very similar
usage, for anything else, the user should use the - - send-header commands.
On 20-Oct-2014 1:16 pm, "Tim Rühsen"  wrote:

Am Montag, 20. Oktober 2014, 20:08:05 schrieb Matthew Atkinson:
> On 20/10/14 16:26, Giuseppe Scrivano wrote:
> > should we do this only for POST requests?  What about doing it in any
> > case that "!(opt.body_data || opt.body_file)"?
> >
> > Regards,
> > Giuseppe
>
> From http://tools.ietf.org/html/rfc7230#section-3.3.2
>
> > A user agent SHOULD send a Content-Length in a request message when
> > no Transfer-Encoding is sent and the request method defines a
> > meaning for an enclosed payload body.  For example, a Content-Length
> > header field is normally sent in a POST request even when the value
> > is 0 (indicating an empty payload body).  A user agent SHOULD
> > NOT send a Content-Length header field when the request message
> > does not contain a payload body and the method semantics do not
> > anticipate such a body.
>
> I would think that we should do this for POST and PUT, but nothing
> else

Since Wget does not explicitly support PUT, we may just care for the POST
request. Since servers may reject a POST without Content-Length, we are
better
off with supplying one. Since PUT (and also PATCH) request 'anticipate' a
body,
we could also care for these.

Matthew, could you also check for 'put and 'patch' request and send an
amended
version of the patch. FYI, HTTP PATCH request is rfc5789.


RFC 7230
3.3.2. Content-Length

   A user agent SHOULD send a Content-Length in a request message when
   no Transfer-Encoding is sent and the request method defines a meaning
   for an enclosed payload body.  For example, a Content-Length header
   field is normally sent in a POST request even when the value is 0
   (indicating an empty payload body).  A user agent SHOULD NOT send a
   Content-Length header field when the request message does not contain
   a payload body and the method semantics do not anticipate such a
   body.


RFC7231
6.5.10. 411 Length Required

   The 411 (Length Required) status code indicates that the server
   refuses to accept the request without a defined Content-Length
   (Section 3.3.2 of [RFC7230]).  The client MAY repeat the request if
   it adds a valid Content-Length header field containing the length of
   the message body in the request message.


Tim


Re: [Bug-wget] Regression in git master branch (commit 8e6de1fb5ff0ca0c749da7db634a1b1e3a1215a2)

2014-10-20 Thread Ángel González

On 21/10/14 01:20, Gabriel Somlo wrote:

Hi Giuseppe,

I think I found a regression in the development branch of wget, which
wasn't present in 1.15. Using "git bisect", it appears the offending
commit was 8e6de1fb5ff0ca0c749da7db634a1b1e3a1215a2 ("Drop usage of
strncpy") from Jun. 9 2014.

To reproduce, grab a "wide and shallow" copy of wikipedia.org, like so:


   wget -rpkEHN -e robots=off --random-wait -t 2 -U mozilla -l 1 \
-P ./vservers wikipedia.org


then (some 20 minutes or so later) open ./vservers/wikipedia.org/index.html
in your (firefox-32.0.2-1.fc20.x86_64) browser as a file.

Before commit 8e6de1fb5ff0ca0c749da7db634a1b1e3a1215a2, the page looks
normal. After said commit, it looks (for me) like in the attached
screenshot.

I couldn't revert the commit, as subsequent changes have accumulated
since it was applied. I'm also relatively new to the code base, so
figuring out exactly what went wrong might be significantly faster
for you :)

I can (if necessary) open a ticket in the bug tracker (e.g. for a
better place to upload the screenshot, which may not make it through
on the mailing list), just let me know what you think.

Thanks much,
--Gabriel

Thanks for your report!

After a quick look, get_uri_string seems to not be taking into account 
the end of the url() parameter (it was before 8e6de1fb5).

Can you check if this patch fixes the issue?

diff --git a/src/css-url.c b/src/css-url.c
index c605798..34a20af 100644
--- a/src/css-url.c
+++ b/src/css-url.c
@@ -72,6 +72,8 @@ extern int yylex (void);
 static char *
 get_uri_string (const char *at, int *pos, int *length)
 {
+  char *uri;
+
   if (0 != strncasecmp (at + *pos, "url(", 4))
 return NULL;

@@ -97,7 +99,14 @@ get_uri_string (const char *at, int *pos, int *length)
   *length -= 2;
 }

-  return xstrdup (at + *pos);
+  uri = xmalloc (*length + 1);
+  if (uri)
+{
+  strncpy (uri, at + *pos, *length);
+  uri[*length] = '\0';
+}
+
+  return uri;
 }

 void





Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-20 Thread Pär Karlsson
Whoops, I realised I failed on the GNU coding standards, please disregard
the last one; the patch below should be better.

My apologies :-/

/Pär

diff --git a/src/ChangeLog b/src/ChangeLog
index d5aeca0..87abd85 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,8 @@
+2014-10-20 Pär Karlsson  
+
+   * utils.c (concat_strings): got rid of double loop, cleaned up
potential
+   memory corruption if concat_strings was called with more than five
args
+
 2014-10-16  Tim Ruehsen  

* url.c (url_parse): little code cleanup
diff --git a/src/utils.c b/src/utils.c
index 78c282e..5f359e0 100644
--- a/src/utils.c
+++ b/src/utils.c
@@ -356,42 +356,36 @@ char *
 concat_strings (const char *str0, ...)
 {
   va_list args;
-  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
   char *ret, *p;

   const char *next_str;
-  int total_length = 0;
-  size_t argcount;
+  size_t len;
+  size_t total_length = 0;
+  size_t charsize = sizeof (char);
+  size_t chunksize = 64;
+  size_t bufsize = 64;
+
+  p = ret = xmalloc (charsize * bufsize);

   /* Calculate the length of and allocate the resulting string. */

-  argcount = 0;
   va_start (args, str0);
   for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
 {
-  int len = strlen (next_str);
-  if (argcount < countof (saved_lengths))
-saved_lengths[argcount++] = len;
+  len = strlen (next_str);
+  if (len == 0)
+continue;
   total_length += len;
-}
-  va_end (args);
-  p = ret = xmalloc (total_length + 1);
-
-  /* Copy the strings into the allocated space. */
-
-  argcount = 0;
-  va_start (args, str0);
-  for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
-{
-  int len;
-  if (argcount < countof (saved_lengths))
-len = saved_lengths[argcount++];
-  else
-len = strlen (next_str);
+  if (total_length > bufsize)
+  {
+bufsize += chunksize;
+ret = xrealloc (ret, charsize * bufsize);
+  }
   memcpy (p, next_str, len);
   p += len;
 }
   va_end (args);
+  ret = xrealloc (ret, charsize * total_length + 1);
   *p = '\0';

   return ret;

2014-10-20 22:14 GMT+02:00 Pär Karlsson :

> Thank you for your feedback and suggestions.
>
> I thought about this during the weekend and figured it could be done much
> more efficiently, by only looping through the arguments once. Also, I
> realized the the original version would have segfaulted if called with more
> than 5 args, since the destination memory never got allocated before the
> memcpy (in the current code, it never is, though!).
>
> I cleaned up the code according to the GNU coding standards as well. The
> test suite rolls with this, and I think it looks better (although the
> function is only really called in a handful of places in all of wget).
>
> Best regards,
>
> /Pär
>
> Patch below:
>
> diff --git a/src/ChangeLog b/src/ChangeLog
> index d5aeca0..87abd85 100644
> --- a/src/ChangeLog
> +++ b/src/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-10-20 Pär Karlsson  
> +
> +   * utils.c (concat_strings): got rid of double loop, cleaned up
> potential
> +   memory corruption if concat_strings was called with more than five
> args
> +
>  2014-10-16  Tim Ruehsen  
>
> * url.c (url_parse): little code cleanup
> diff --git a/src/utils.c b/src/utils.c
> index 78c282e..dbeb9fe 100644
> --- a/src/utils.c
> +++ b/src/utils.c
> @@ -356,42 +356,36 @@ char *
>  concat_strings (const char *str0, ...)
>  {
>va_list args;
> -  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
>char *ret, *p;
>
>const char *next_str;
> -  int total_length = 0;
> -  size_t argcount;
> +  size_t len;
> +  size_t total_length = 0;
> +  size_t charsize = sizeof (char);
> +  size_t chunksize = 64;
> +  size_t bufsize = 64;
> +
> +  p = ret = xmalloc (charsize * bufsize);
>
>/* Calculate the length of and allocate the resulting string. */
>
> -  argcount = 0;
>va_start (args, str0);
>for (next_str = str0; next_str != NULL; next_str = va_arg (args, char
> *))
>  {
> -  int len = strlen (next_str);
> -  if (argcount < countof (saved_lengths))
> -saved_lengths[argcount++] = len;
> +  len = strlen (next_str);
> +  if (len == 0) {
> +  continue;
> +  }
>total_length += len;
> -}
> -  va_end (args);
> -  p = ret = xmalloc (total_length + 1);
> -
> -  /* Copy the strings into the allocated space. */
> -
> -  argcount = 0;
> -  va_start (args, str0);
> -  for (next_str = str0; next_str != NULL; next_str = va_arg (args, char
> *))
> -{
> -  int len;
> -  if (argcount < countof (saved_lengths))
> -len = saved_lengths[argcount++];
> -  else
> -len = strlen (next_str);
> +  if (total_length > bufsize) {
> +  bufsize += chunksize;
> +  ret = xrealloc (ret, charsize * bufsize);
> +  }
>memcpy (p, next_str, len);
>p += 

Re: [Bug-wget] Send Content-Length with POST 0 length body

2014-10-20 Thread Tim Rühsen
Am Montag, 20. Oktober 2014, 20:08:05 schrieb Matthew Atkinson:
> On 20/10/14 16:26, Giuseppe Scrivano wrote:
> > should we do this only for POST requests?  What about doing it in any
> > case that "!(opt.body_data || opt.body_file)"?
> > 
> > Regards,
> > Giuseppe
> 
> From http://tools.ietf.org/html/rfc7230#section-3.3.2
> 
> > A user agent SHOULD send a Content-Length in a request message when
> > no Transfer-Encoding is sent and the request method defines a
> > meaning for an enclosed payload body.  For example, a Content-Length
> > header field is normally sent in a POST request even when the value
> > is 0 (indicating an empty payload body).  A user agent SHOULD
> > NOT send a Content-Length header field when the request message
> > does not contain a payload body and the method semantics do not
> > anticipate such a body.
> 
> I would think that we should do this for POST and PUT, but nothing
> else

Since Wget does not explicitly support PUT, we may just care for the POST 
request. Since servers may reject a POST without Content-Length, we are better 
off with supplying one. Since PUT (and also PATCH) request 'anticipate' a body, 
we could also care for these.

Matthew, could you also check for 'put and 'patch' request and send an amended 
version of the patch. FYI, HTTP PATCH request is rfc5789.


RFC 7230
3.3.2. Content-Length

   A user agent SHOULD send a Content-Length in a request message when
   no Transfer-Encoding is sent and the request method defines a meaning
   for an enclosed payload body.  For example, a Content-Length header
   field is normally sent in a POST request even when the value is 0
   (indicating an empty payload body).  A user agent SHOULD NOT send a
   Content-Length header field when the request message does not contain
   a payload body and the method semantics do not anticipate such a
   body.


RFC7231
6.5.10. 411 Length Required

   The 411 (Length Required) status code indicates that the server
   refuses to accept the request without a defined Content-Length
   (Section 3.3.2 of [RFC7230]).  The client MAY repeat the request if
   it adds a valid Content-Length header field containing the length of
   the message body in the request message.


Tim




Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-20 Thread Pär Karlsson
Thank you for your feedback and suggestions.

I thought about this during the weekend and figured it could be done much
more efficiently, by only looping through the arguments once. Also, I
realized the the original version would have segfaulted if called with more
than 5 args, since the destination memory never got allocated before the
memcpy (in the current code, it never is, though!).

I cleaned up the code according to the GNU coding standards as well. The
test suite rolls with this, and I think it looks better (although the
function is only really called in a handful of places in all of wget).

Best regards,

/Pär

Patch below:

diff --git a/src/ChangeLog b/src/ChangeLog
index d5aeca0..87abd85 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,8 @@
+2014-10-20 Pär Karlsson  
+
+   * utils.c (concat_strings): got rid of double loop, cleaned up
potential
+   memory corruption if concat_strings was called with more than five
args
+
 2014-10-16  Tim Ruehsen  

* url.c (url_parse): little code cleanup
diff --git a/src/utils.c b/src/utils.c
index 78c282e..dbeb9fe 100644
--- a/src/utils.c
+++ b/src/utils.c
@@ -356,42 +356,36 @@ char *
 concat_strings (const char *str0, ...)
 {
   va_list args;
-  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
   char *ret, *p;

   const char *next_str;
-  int total_length = 0;
-  size_t argcount;
+  size_t len;
+  size_t total_length = 0;
+  size_t charsize = sizeof (char);
+  size_t chunksize = 64;
+  size_t bufsize = 64;
+
+  p = ret = xmalloc (charsize * bufsize);

   /* Calculate the length of and allocate the resulting string. */

-  argcount = 0;
   va_start (args, str0);
   for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
 {
-  int len = strlen (next_str);
-  if (argcount < countof (saved_lengths))
-saved_lengths[argcount++] = len;
+  len = strlen (next_str);
+  if (len == 0) {
+  continue;
+  }
   total_length += len;
-}
-  va_end (args);
-  p = ret = xmalloc (total_length + 1);
-
-  /* Copy the strings into the allocated space. */
-
-  argcount = 0;
-  va_start (args, str0);
-  for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
-{
-  int len;
-  if (argcount < countof (saved_lengths))
-len = saved_lengths[argcount++];
-  else
-len = strlen (next_str);
+  if (total_length > bufsize) {
+  bufsize += chunksize;
+  ret = xrealloc (ret, charsize * bufsize);
+  }
   memcpy (p, next_str, len);
   p += len;
 }
   va_end (args);
+  ret = xrealloc (ret, charsize * total_length + 1);
   *p = '\0';

   return ret;



2014-10-19 16:16 GMT+02:00 Giuseppe Scrivano :

> Pär Karlsson  writes:
>
> > Hi, I fould a potential gotcha when playing with clang's code analysis
> tool.
> >
> > The concat_strings function silently stopped counting string lengths when
> > given more than 5 arguments. clang warned about potential garbage values
> in
> > the saved_lengths array, so I redid it with this approach.
> >
> > All tests working ok with this patch.
>
> thanks for your contribution.  I've just few comments:
>
>
> > commit 2d855670e0e1fbe578506b376cdd40b0e465d3ef
> > Author: Pär Karlsson 
> > Date:   Thu Oct 16 21:41:36 2014 +0200
> >
> > Updated ChangeLog
> >
>
> we usually update the changelog in the same commit that made the change,
> so please squash these two commits into one.
>
> Also, it doesn't apply on current git master, as it seems to be based on
> a old version of git from the ChangeLog file context, could you rebase
> onto the master branch?
>
>
> > diff --git a/src/utils.c b/src/utils.c
> > index 78c282e..93c9ddc 100644
> > --- a/src/utils.c
> > +++ b/src/utils.c
> > @@ -356,7 +356,8 @@ char *
> >  concat_strings (const char *str0, ...)
> >  {
> >va_list args;
> > -  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
> > +  size_t psize = sizeof(int);
>
> please leave a space between sizeof and '(' as mandated by the GNU
> coding standards.
>
> > +  int *saved_lengths = xmalloc (psize);
> >char *ret, *p;
> >
> >const char *next_str;
> > @@ -370,8 +371,8 @@ concat_strings (const char *str0, ...)
> >for (next_str = str0; next_str != NULL; next_str = va_arg (args, char
> *))
> >  {
> >int len = strlen (next_str);
> > -  if (argcount < countof (saved_lengths))
> > -saved_lengths[argcount++] = len;
> > +  saved_lengths[argcount++] = len;
> > +  xrealloc(saved_lengths, psize * argcount);
>
> same here.
>
> >total_length += len;
> >  }
> >va_end (args);
> > @@ -393,7 +394,7 @@ concat_strings (const char *str0, ...)
> >  }
> >va_end (args);
> >*p = '\0';
> > -
> > +  free(saved_lengths);
>
> and here.
>
> Regards,
> Giuseppe
>


Re: [Bug-wget] [Win32 Patch] console-close events

2014-10-20 Thread Ángel González

On 20/10/14 03:03, Darshit Shah wrote:

On 06/17, Gisle Vanem wrote:

"Ángel González"  wrote:
PS. The above Sleep() seems to be ignored by WinCon. At least I 
failed to

make it sleep more than ~500 msec.

There may be a timeout on how long you can stay processing the event.

Why do you need that Sleep() call at all? I would remove  it.


When logging to the console (no '-o log-file' option), the Sleep(500) 
will make
the final "... cleanup." message stay a tiny bit longer (but barely 
readable).

Without a Sleep(), the console gets closed with only a message-beep.



Any final reviews / comments on this patch? I haven't tested it out 
for the lack of a Windows system. If there are no objections, maybe we 
can push this patch?
I would expect someone running a console application to have a console 
open. I understand the rationale when it's a beginner learning to 
program and is creating a console application, but don't really see a 
usecase for wget. How are you running wget that you get an autoclosing 
console?


Cheers




Re: [Bug-wget] Send Content-Length with POST 0 length body

2014-10-20 Thread Matthew Atkinson
On 20/10/14 16:26, Giuseppe Scrivano wrote:
> should we do this only for POST requests?  What about doing it in any
> case that "!(opt.body_data || opt.body_file)"?
> 
> Regards,
> Giuseppe

>From http://tools.ietf.org/html/rfc7230#section-3.3.2
> A user agent SHOULD send a Content-Length in a request message when
> no Transfer-Encoding is sent and the request method defines a
> meaning for an enclosed payload body.  For example, a Content-Length
> header field is normally sent in a POST request even when the value
> is 0 (indicating an empty payload body).  A user agent SHOULD
> NOT send a Content-Length header field when the request message
> does not contain a payload body and the method semantics do not
> anticipate such a body.

I would think that we should do this for POST and PUT, but nothing
else

Thanks,
Matt



Re: [Bug-wget] [Bug-Wget] Summary of Pending Patches

2014-10-20 Thread Ángel González

On 20/10/14 05:14, Darshit Shah wrote:
Maybe we could automate all of these under a single make target? Or 
else, I

intend to set up a Travis CI build every week to run these tests. What is
everyone's opinion on this?


IMHO, they should be in a script / make target (make maintainer-tests ? ☺)
Otherwise, the list of incantations is not easily discoverable and might 
even be forgotten.







Re: [Bug-wget] Bug #43324: file names don't show up correctly in the log when using --backups

2014-10-20 Thread Tim Rühsen
Am Montag, 13. Oktober 2014, 10:36:41 schrieb Tim Ruehsen:
> On Sunday 12 October 2014 10:48:05 Bartosz Szczesny wrote:
> > Hello,
> > 
> > I think https://savannah.gnu.org/bugs/?43324 is not a bug (see
> > https://savannah.gnu.org/bugs/?43324#comment2) and can be closed.
> 
> Hi Bartosz,
> 
> thanks for pointing that out. I think you are right.
> 
> The 'bug' is not a bug.
> 
> If someone needs a more verbose output regarding file name rotation, a new
> 'wishlist' bug should be opened.

I just closed this bug.

Thanks, Bartosz.

Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] Send Content-Length with POST 0 length body

2014-10-20 Thread Giuseppe Scrivano
Matthew Atkinson  writes:

> diff --git a/src/ChangeLog b/src/ChangeLog
> index 1c4e2d5..447179e 100644
> --- a/src/ChangeLog
> +++ b/src/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-10-19  Matthew Atkinson  
> +
> + * http.c (gethttp): Always send Content-Length header when method is 
> post,
> + even with no post body, as some servers will reject the request 
> otherwise
> +
>  2014-05-03  Tim Ruehsen  
>  
>   * retr.c (retrieve_url): fixed memory leak
> diff --git a/src/http.c b/src/http.c
> index 4b99c17..e020682 100644
> --- a/src/http.c
> +++ b/src/http.c
> @@ -1875,6 +1875,8 @@ gethttp (struct url *u, struct http_stat *hs, int *dt, 
> struct url *proxy,
>xstrdup (number_to_static_string 
> (body_data_size)),
>rel_value);
>  }
> +  else if (strcasecmp (opt.method, "post") == 0)
> +request_set_header (req, "Content-Length", "0", rel_none);

should we do this only for POST requests?  What about doing it in any
case that "!(opt.body_data || opt.body_file)"?

Regards,
Giuseppe



Re: [Bug-wget] wget

2014-10-20 Thread Bryan Baas
Fair enough.  Thank all for the timely replies.


-- 
Bryan Baas
Weyco IT
x1808
414 241 0499 (cell)

On 10/18/2014 10:44 AM, Yousong Zhou wrote:
> Hi, Bryan
> 
> Am 18.10.2014 21:43 schrieb "Bryan Baas"  >:
>>
>> Hi,
>>
>> I was wondering about the command output of wget.  I used a Java Runtime
>> exec and, although the wget process ended with a 0 completion code, the
>> results appeared in the error stream and not the output stream.
>>
>> As a further test, I executed the same command at the command line and
>> redirected output to a file using the > operator.  Upon completion the
>> file was empty, but the results scrolled down the screen.  This had me
>> thinking that the wget command itself is directing its regular output to
>> sderr instead of stdout.
> 
> Yes, that is the expected.  It is possible to set the output file to
> stdout with "-O -" in which case you do not want to see output of wget
> itself and the file content mangled together.
> 
>>
>> The results of the wget command, from what I could tell, weren't error
>> conditions but regular output from a successful execution.
>>
> 
> I think it is a convention that debug, informational, error, verbose
> output of unix programs be written to stderr.  However, the choice of
> redirecting stderr to whatever file descriptor users prefer is always
> available.
> 
> regards.
> 
> yousong
> 
>> Your feedback would be appreciated.
>>
>> regards,
>>
>>
>> --
>> Bryan Baas
>> Weyco IT
>> x1808
>> 414 241 0499 (cell)
>>
> 






Re: [Bug-wget] [Bug-Wget] Summary of Pending Patches

2014-10-20 Thread Giuseppe Scrivano
Darshit Shah  writes:

> I was going through my local repository and noticed that the following
> patches are currently pending. I wanted a final round of reviews on
> them so that we may merge them into mainline.
>
> I've tried to explain the context behind each patch below and have
> also resent the version of each patch I had locally to the relevant
> threads. So, I'm sorry for spamming your mailboxes.
>
> 1. Fix make distcheck for Python tests
>
> Tim discovered that the Python based test suite wasn't passing make
> distcheck due to a variety of reasons. I've fixed most of all the
> issues with the code.
> Some hacks still need to be eliminated. But in general, everything works right
> now.
>
> 2. Minor optimizations of Python tests
>
> While working with the Python test suite to fix the above mentioned issues, I
> came across and fixed some inefficiencies / cruft code in the test suite. 
> Those
> have been eliminated in this patch.

please push these two, let's have a release this weekend or by the
beginning of the next week.

Regards,
Giuseppe