Re: [Bug-wget] wget different from browser

2016-05-23 Thread Pär Karlsson
Hi Marco,

The page is protected by basic authentication, so I could not test it
myself.

But it seems to me the URL parameters might be malformed.

Normally each key/value pair is separated by the '&' character, and
unless the server does its own parameter parsing it won't understand the
URL.

Now, the browser might silently rewrite the GET URL into a proper format so
that it "just works", but wget only does exactly what you ask it to.

Try replacing each ';' in the URL with a '&' and see what happens:

http://data.icecat.biz/xml_s3/xml_server3.cgi?prod_id=4347B007,8576B064&vendor=canon&lang=IT&output=productcsv
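
For example (the --http-user/--http-password options below are just placeholders
for whatever credentials the site expects, and the single quotes keep the shell
from interpreting the '&' characters):

wget --http-user=USER --http-password=PASSWORD \
  'http://data.icecat.biz/xml_s3/xml_server3.cgi?prod_id=4347B007,8576B064&vendor=canon&lang=IT&output=productcsv'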

Best regards,

/Pär
On May 24, 2016 1:26 AM, "Marco Barbieri"  wrote:

> Hi, my name is Marco,
>
> I'm trying to download the CSV returned from icecat.biz.
>
> example:
>
>
> http://data.icecat.biz/xml_s3/xml_server3.cgi?prod_id=4347B007,8576B064;vendor=canon;lang=IT;output=productcsv
>
> in the browser this will return a text file as csv,
> wget returns an XML, which could be good enough except for the language: I
> ask for IT, but the XML is in EN :(
>
> could you help me?
>
> thanks
> Marco
>


Re: [Bug-wget] [Bug-Wget] Bug: Test-proxied-https-auth-keepalive.px does not chdir to test directory

2015-06-16 Thread Pär Karlsson
Absolutely. Out of habit, I used "perltidy -gnu" on the source files, which
mangled some lines. The attached patch retains the original formatting.

/Pär



2015-06-16 10:31 GMT+02:00 Tim Ruehsen :

> Hi Pär,
>
> many thanks for your work !
>
> But please could you send the patch without the reformatting ?
>
> Regards, Tim
>
> On Tuesday 16 June 2015 00:17:08 Pär Karlsson wrote:
> > Here is a suggested patch. I don't know if it will solve the problem, but
> > at least there should be diagnostics in the log if something fails.
> >
> > Best regards,
> >
> > /pär
> >
> > 2015-06-15 13:37 GMT+02:00 Pär Karlsson :
> > > I quickly looked at the code in question, and one obvious quick test
> > > would be to check the return value of the "unlink" calls (and $! / $ERRNO
> > > afterwards) at the two places in the respective .px files where
> > > 'needs-auth.txt' is handled.
> > >
> > > I've been cleaning up the two tests mentioned but so far have been unable
> > > to reproduce the problem with the test file being left over, and I'd be
> > > happy to come up with a patch when I'm finished.
> > >
> > > Until then, I would suggest a quick change to
> > >
> > > if (-e 'needs-auth.txt') {
> > >
> > >    unlink 'needs-auth.txt' or warn "Cannot remove 'needs-auth.txt': $!";
> > >
> > > }
> > >
> > > would be enough to at least pinpoint the problem if it has to do with
> > > file permissions.
> > >
> > > There are some other potentially problematic constructs in the tests
> > > too,
> > > (the pipe()/select() calls, for instance, which _might_ cause race
> > > conditions in conjunction with the unlink() call), but this is all
> > > conjecture on my part; I have not managed to get the tests to fail yet.
> > >
> > > /Pär
> > >
> > > 2015-06-15 11:56 GMT+02:00 Darshit Shah :
> > >> The two proxied https tests:
> > >> Test-proxied-https-auth-keepalive.px
> > >> Test-proxied-https-auth.px
> > >>
> > >> Implement a second HTTPS server inside the test itself and do not use
> > >> the standard WgetTest.pm class. In these implementations, we do not
> > >> actually chdir() to the correct test directory. I've been seeing some
> > >> failures in make distcheck, because the file needs-auth.txt sometimes
> > >> is left over after the test is run. Currently, I haven't been able to
> > >> debug the issue for why the file remains around, but I guess we should
> > >> fix the test directory for it first.
> > >>
> > >> My Perl is ridiculously horrible. So if someone who knows a bit of
> > >> Perl can help, I'd be very grateful! Else, I'll sit down over the
> > >> coming weekend and make the changes.
> > >>
> > >> --
> > >> Thanking You,
> > >> Darshit Shah
>


0001-Make-tests-tell-us-more-about-failures.patch
Description: Binary data


Re: [Bug-wget] [Bug-Wget] Bug: Test-proxied-https-auth-keepalive.px does not chdir to test directory

2015-06-15 Thread Pär Karlsson
Here is a suggested patch. I don't know if it will solve the problem, but
at least there should be diagnostics in the log if something fails.

Best regards,

/pär

2015-06-15 13:37 GMT+02:00 Pär Karlsson :

> I quickly looked at the code in question, and one obvious quick test would
> be to check the return value of the "unlink" calls (and $! / $ERRNO
> afterwards) at the two places in the respective .px files where
> 'needs-auth.txt' is handled.
>
> I've been cleaning up the two tests mentioned but so far have been unable
> to reproduce the problem with the test file being left over, and I'd be
> happy to come up with a patch when I'm finished.
>
> Until then, I would suggest a quick change to
>
> if (-e 'needs-auth.txt') {
>    unlink 'needs-auth.txt' or warn "Cannot remove 'needs-auth.txt': $!";
> }
>
> would be enough to at least pinpoint the problem if it has to do with file
> permissions.
>
> There are some other potentially problematic constructs in the tests too,
> (the pipe()/select() calls, for instance, which _might_ cause race
> conditions in conjunction with the unlink() call), but this is all
> conjecture on my part; I have not managed to get the tests to fail yet.
>
> /Pär
>
> 2015-06-15 11:56 GMT+02:00 Darshit Shah :
>
>> The two proxied https tests:
>> Test-proxied-https-auth-keepalive.px
>> Test-proxied-https-auth.px
>>
>> Implement a second HTTPS server inside the test itself and do not use
>> the standard WgetTest.pm class. In these implementations, we do not
>> actually chdir() to the correct test directory. I've been seeing some
>> failures in make distcheck, because the file needs-auth.txt sometimes
>> is left over after the test is run. Currently, I haven't been able to
>> debug the issue for why the file remains around, but I guess we should
>> fix the test directory for it first.
>>
>> My Perl is ridiculously horrible. So if someone who knows a bit of
>> Perl can help, I'd be very grateful! Else, I'll sit down over the
>> coming weekend and make the changes.
>>
>> --
>> Thanking You,
>> Darshit Shah
>>
>>
>
From 2e3647cc791f651a2f7bec1ef9e19300227b4326 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A4r=20Karlsson?= 
Date: Mon, 15 Jun 2015 14:34:45 +0200
Subject: [PATCH] Make tests tell us more about failures

* tests/tests/Test-proxied-https-auth-keepalive.px: check return values
  of flagged functions, general cleanups and safer calls to pipe I/O

* tests/Test-proxied-https-auth.px: check return values of flagged
  functions, general cleanups and safer calls to pipe I/O
---
 tests/Test-proxied-https-auth-keepalive.px | 184 ---
 tests/Test-proxied-https-auth.px   | 193 ++---
 2 files changed, 236 insertions(+), 141 deletions(-)

diff --git a/tests/Test-proxied-https-auth-keepalive.px b/tests/Test-proxied-https-auth-keepalive.px
index 349778a..75dc877 100755
--- a/tests/Test-proxied-https-auth-keepalive.px
+++ b/tests/Test-proxied-https-auth-keepalive.px
@@ -6,51 +6,62 @@ use strict;
 use warnings;
 
 use WgetFeature qw(https);
-use WgetTests;  # For $WGETPATH.
+use WgetTests;# For $WGETPATH.
 
 my $cert_path;
 my $key_path;
 my $srcdir;
 
-if (@ARGV) {
+if (@ARGV)
+{
 $srcdir = shift @ARGV;
-} elsif (defined $ENV{srcdir}) {
+}
+elsif (defined $ENV{srcdir})
+{
 $srcdir = $ENV{srcdir};
 }
 
-if (defined $srcdir) {
-$key_path = "$srcdir/certs/server-key.pem";
+if (defined $srcdir)
+{
+$key_path  = "$srcdir/certs/server-key.pem";
 $cert_path = "$srcdir/certs/server-cert.pem";
-} else {
-$key_path = "certs/server-key.pem";
-$cert_path = "certs/server-cert.pem";
+}
+else
+{
+$key_path  = 'certs/server-key.pem';
+$cert_path = 'certs/server-cert.pem';
 }
 
-
+use English qw(-no_match_vars);
 use HTTP::Daemon;
 use HTTP::Request;
+use IO::Handle;
 use IO::Socket::SSL;
 
-my $SOCKET = HTTP::Daemon->new (LocalAddr => 'localhost',
-ReuseAddr => 1) or die "Cannot create server!!!";
+my $SOCKET = HTTP::Daemon->new(LocalAddr => 'localhost',
+   ReuseAddr => 1)
+  or die 'Cannot create server!!!';
 
-sub get_request {
-my $conn = shift;
-my $content = '';
+sub get_request
+{
+my $conn= shift;
+my $content = q{};
 my $line;
 
-while (defined ($line = <$conn>)) {
+while (defined($line = <$conn>))
+{
 $content .= $line;
 last if $line eq "\r\n";
 }
 
 my $rqst = HTTP::Request->parse($content)
-or die "Couldn

Re: [Bug-wget] [Bug-Wget] Bug: Test-proxied-https-auth-keepalive.px does not chdir to test directory

2015-06-15 Thread Pär Karlsson
I quickly looked at the code in question, and one obvious quick test would
be to check the return value of the "unlink" calls (and $! / $ERRNO
afterwards) at the two places in the respective .px files where
'needs-auth.txt' is handled.

I've been cleaning up the two tests mentioned but so far have been unable
to reproduce the problem with the test file being left over, and I'd be
happy to come up with a patch when I'm finished.

Until then, I would suggest a quick change to

if (-e 'needs-auth.txt') {
   unlink 'needs-auth.txt' or warn "Cannot remove 'needs-auth.txt': $!";
}

would be enough to at least pinpoint the problem if it has to do with file
permissions.

There are some other potentially problematic constructs in the tests too,
(the pipe()/select() calls, for instance, which _might_ cause race
conditions in conjunction with the unlink() call), but this is all
conjecture on my part; I have not managed to get the tests to fail yet.

/Pär

2015-06-15 11:56 GMT+02:00 Darshit Shah :

> The two proxied https tests:
> Test-proxied-https-auth-keepalive.px
> Test-proxied-https-auth.px
>
> Implement a second HTTPS server inside the test itself and do not use
> the standard WgetTest.pm class. In these implementations, we do not
> actually chdir() to the correct test directory. I've been seeing some
> failures in make distcheck, because the file needs-auth.txt sometimes
> is left over after the test is run. Currently, I haven't been able to
> debug the issue for why the file remains around, but I guess we should
> fix the test directory for it first.
>
> My Perl is ridiculously horrible. So if someone who knows a bit of
> Perl can help, I'd be very grateful! Else, I'll sit down over the
> coming weekend and make the changes.
>
> --
> Thanking You,
> Darshit Shah
>
>


Re: [Bug-wget] [PATCH] Stylistic and idiomatic cleanups in Perl tests

2015-05-20 Thread Pär Karlsson
This particular piece of code is only executed if a test *fails*, in
_verify_download(). So as long as the tests succeeded, this particular code
path was never executed. :-)

Best regards,

/Pär

2015-05-20 9:59 GMT+02:00 Tim Ruehsen :

> Thank you, Pär and Hubert.
>
> A very good catch !
>
> Since none of our Perl tests failed even though we have broken Perl code,
> the question arises whether this is dead code (not executed when running
> our tests)!?
>
> BTW, I would like to mention that we plan to translate the Perl tests into
> Python. In the end, we want to get rid of the Perl test suite.
> So, before you start spending your precious developer time into improving
> the
> Perl test suite, please consider improving the Python test suite (and/or
> move
> tests from Perl to Python).
>
> Anyways, i'll push your patch soon if nobody complains.
>
> Regards, Tim
>
> On Tuesday 19 May 2015 22:27:24 Pär Karlsson wrote:
> > Yes, you are right. The declaration of $i inside the for expression
> > shadows the scope of the outer $i inside the for loop (which indeed makes
> > the outer $i uninitialized after the loop).
> >
> > The for loop itself is not stylistically Perlish with the C-style loop,
> > but that's a minor gripe. I made an error when using the Perl-style range
> > operator ( N1 .. N2 ) loop and forgot to remove the commented-out line
> > too. And I mistakenly left the extraneous 'my' there, which declares $i in
> > a new scope. After some unintentional off-list conversation with Hubert -
> > which was lucky since he discovered another embarrassing mistake - my
> > suggestion would be to change it to a while loop instead (see attached patch).
> >
> > Best regards,
> >
> > /Pär
> >
> > 2015-05-19 20:04 GMT+02:00 Hubert Tarasiuk :
> > > > From: Pär Karlsson 
> > > >
> > > > A number of Perl-specific cleanups in the test suite perl modules.
> > > >
> > > > Most of these changes have no immediate functional difference but put
> > > > code in line with the same formatting (perltidy GNU style) for
> > > > readability.
> > > >
> > > > A warning regarding invalid operators on incompatible values (numeric
> > > > vs strings) has been fixed.
> > > >
> > > > Some common but discouraged Perl idioms have been updated to a somewhat
> > > > clearer style, especially regarding evals and map() vs. for loops.
> > > >
> > > > I did not touch the FTP and HTTP server modules functionally, but there
> > > > seems to be a lot of strange code therein, which would warrant further
> > > > cleanup and investigation if these tests were to remain in Perl instead
> > > > of moving over to the Python-based tests.
> > > >
> > > > All tests work nominally on my machine (with perl-5.20.1) but should be
> > > > compatible with versions down to 5.6.
> > >
> > > I believe I have found an issue with WgetTests module:
> > > > my $i;
> > > >
> > > > # for ($i=0; $i != $min; ++$i) {
> > > > for my $i (0 .. $min - 1)
> > > > {
> > > >
> > > > last if substr($expected, $i, 1) ne substr $actual, $i, 1;
> > > > if (substr($expected, $i, 1) eq q{\n})
> > > > {
> > > >
> > > > $line++;
> > > > $col = 0;
> > > >
> > > > }
> > > > else
> > > > {
> > > >
> > > > $col++;
> > > >
> > > > }
> > > >
> > > > }
> > > > my $snip_start = $i - ($SNIPPET_SIZE / 2);
> > > > if ($snip_start < 0)
> > > > {
> > > >
> > > > $SNIPPET_SIZE += $snip_start;# Take it from the end.
> > > > $snip_start = 0;
> > > >
> > > > }
> > >
> > > I am no Perl expert but I tested it and current for loop will not set
> > > the $i variable to any value (outside the loop). (Actually it gives me
> > > warning about using uninitialized variable when calculating $snip_start
> > > and the diff is not printed correctly.)
> > > Reverting the commented version seems to fix this problem. Removing the
> > > "my" from current for loop does not seem to fix the problem.
> > > --
> > > Hubert Tarasiuk
>


Re: [Bug-wget] [PATCH] Stylistic and idiomatic cleanups in Perl tests

2015-05-19 Thread Pär Karlsson
Yes, you are right. The declaration of $i inside the for expression shadows
the scope of the outer $i inside the for loop (which indeed makes the outer
$i uninitialized after the loop).

The for loop itself is not stylistically Perlish with the C-style loop, but
that's a minor gripe. I made an error when using the Perl-style range
operator ( N1 .. N2 ) loop and forgot to remove the commented out line too.
And I mistakenly left the extraneous 'my' there, which declares $i in a new
scope. After some unintentional off-list conversation with Hubert - which
was lucky since he discovered another embarrassing mistake - my suggestion
would be to change it to a while loop instead (see attached patch).

Best regards,

/Pär

2015-05-19 20:04 GMT+02:00 Hubert Tarasiuk :

> > From: Pär Karlsson 
> >
> > A number of Perl-specific cleanups in the test suite perl modules.
> >
> > Most of these changes have no immediate functional difference but put
> > code in line with the same formatting (perltidy GNU style) for readability.
> >
> > A warning regarding invalid operators on incompatible values (numeric vs
> > strings) has been fixed.
> >
> > Some common but discouraged Perl idioms have been updated to a somewhat
> > clearer style, especially regarding evals and map() vs. for loops.
> >
> > I did not touch the FTP and HTTP server modules functionally, but there
> > seems to be a lot of strange code therein, which would warrant further cleanup
> > and investigation if these tests were to remain in Perl, instead of moving
> > over to the Python-based tests.
> >
> > All tests work nominally on my machine (with perl-5.20.1) but should be
> > compatible with versions down to 5.6.
>
> I believe I have found an issue with WgetTests module:
> > my $i;
> >
> > # for ($i=0; $i != $min; ++$i) {
> > for my $i (0 .. $min - 1)
> > {
> > last if substr($expected, $i, 1) ne substr $actual, $i, 1;
> > if (substr($expected, $i, 1) eq q{\n})
> > {
> > $line++;
> > $col = 0;
> > }
> > else
> > {
> > $col++;
> > }
> > }
> > my $snip_start = $i - ($SNIPPET_SIZE / 2);
> > if ($snip_start < 0)
> > {
> > $SNIPPET_SIZE += $snip_start;# Take it from the end.
> > $snip_start = 0;
> > }
> I am no Perl expert but I tested it and current for loop will not set
> the $i variable to any value (outside the loop). (Actually it gives me
> warning about using uninitialized variable when calculating $snip_start
> and the diff is not printed correctly.)
> Reverting the commented version seems to fix this problem. Removing the
> "my" from current for loop does not seem to fix the problem.
> --
> Hubert Tarasiuk
>
>
From 30ff711d6568a9269390ea7a8847357ba021fad9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A4r=20Karlsson?= 
Date: Tue, 19 May 2015 22:20:17 +0200
Subject: [PATCH] Fix undeclared loop variable in Perl test suite

---
 tests/WgetTests.pm | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tests/WgetTests.pm b/tests/WgetTests.pm
index 501aeba..34253b1 100644
--- a/tests/WgetTests.pm
+++ b/tests/WgetTests.pm
@@ -265,13 +265,12 @@ sub _show_diff
 my $min  = $explen <= $actlen ? $explen : $actlen;
 my $line = 1;
 my $col  = 1;
-my $i;
+my $i= 0;
 
-# for ($i=0; $i != $min; ++$i) {
-for my $i (0 .. $min - 1)
+while ( $i < $min )
 {
 last if substr($expected, $i, 1) ne substr $actual, $i, 1;
-if (substr($expected, $i, 1) eq q{\n})
+if (substr($expected, $i, 1) eq "\n")
 {
 $line++;
 $col = 0;
@@ -280,6 +279,7 @@ sub _show_diff
 {
 $col++;
 }
+$i++;
 }
 my $snip_start = $i - ($SNIPPET_SIZE / 2);
 if ($snip_start < 0)
-- 
2.3.6



Re: [Bug-wget] [bug #44966] wget -nv is too verbose, -q too quiet

2015-04-30 Thread Pär Karlsson
Also, at least on Unix-like systems, there are the standard redirection
operators in the shell. If you don't want to see errors, just redirect
stderr to /dev/null, as in: wget [...] 2>/dev/null


2015-04-30 8:59 GMT+02:00 Darshit Shah :

> Follow-up Comment #2, bug #44966 (project wget):
>
> Not really. Sometimes I expect Wget to fail and do not want its output
> messing my script up. In other cases, I know how to handle Wget's failures
> using exit codes and do not want the output to make things ugly.
>
> I usually run wget with -q --show-progress and would hate for errors to
> clutter up the otherwise perfect screen.
>
> Also, the -q switch should really be "quiet". We do not want to print
> *anything* at all if it can be avoided to the screen.
>
> The -nv switch, I think, is already that sweet spot between verbose and
> quiet.
> It will print only error messages and a success message for each file it
> downloads.
>
> In my opinion, we should not change / add any more verbosity options.
>
> ___
>
> Reply to this item at:
>
>   
>
> ___
>   Message sent via/by Savannah
>   http://savannah.gnu.org/
>
>
>


Re: [Bug-wget] Configuration to ignore a specified domain, or treat it differently?

2014-11-26 Thread Pär Karlsson
Hello,

This is a kludge, but you could temporarily short-circuit the problematic
domain/hostname to localhost in your /etc/hosts file or (if on Windows) the
equivalent file (C:\Windows\System32\drivers\etc\hosts).
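
A minimal sketch of the idea (the hostname below is just a made-up placeholder;
remember to remove the entry again when you are done):

  # as root, append an entry pointing the dead host at localhost:
  echo '127.0.0.1 assets.dead-domain.example' >> /etc/hosts

Connections to that host should then be refused immediately on localhost
instead of hanging until the remote connection times out.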

/Pär



2014-11-26 1:36 GMT+01:00 Dun Peal :

> I'm using wget to mirror an old website with all of its
> --page-requisites. Unfortunately, one of the domains which used to
> serve some of these requisites - is now offline. Every time wget tries
> to get anything from that domain, it blocks for 10+ seconds until the
> connection finally times out.
>
> This issue is substantially exacerbated by the 20 retries. I want to
> keep the default retry count at 20, but I know this one particular
> domain is never going to respond.
>
> Is there a way to tell wget to ignore that domain entirely, i.e. not
> try to fetch anything from there?
>
> If not, can I at least apply different configurations to request from
> that domain, for example specify 0 retries for it?
>
> Otherwise, any other suggested solutions for this problem?
>
> Thanks, D.
>
>


Re: [Bug-wget] [PATCH 2/8] Add extern declaration for version.c strings

2014-11-25 Thread Pär Karlsson
Hi,

Attached is a patch for a missing 'version.h' in wget_SOURCES in the
src/Makefile.am, which I suppose should have been included in the above
patch by Darshit.

Without it, 'make distcheck' fails.

Best regards,

/Pär

2014-11-22 10:22 GMT+01:00 Darshit Shah :

> ---
>  src/ChangeLog   |  8 
>  src/Makefile.am |  1 +
>  src/http.c  |  2 +-
>  src/main.c  |  4 +---
>  src/version.h   | 33 +
>  src/warc.c  |  2 +-
>  6 files changed, 45 insertions(+), 5 deletions(-)
>  create mode 100644 src/version.h
>
> diff --git a/src/ChangeLog b/src/ChangeLog
> index d49e6ca..e912cf9 100644
> --- a/src/ChangeLog
> +++ b/src/ChangeLog
> @@ -1,5 +1,13 @@
>  2014-11-22  Darshit Shah  
>
> +   * version.h: New file. Add extern declarations for globally shared
> strings
> +   * Makefile.am: Have version.c import version.h
> +   * main.c: Import version.h and remove old extern declarations
> +   * http.c: Same
> +   * warc.c: Same
> +
> +2014-11-22  Darshit Shah  
> +
> * utils.c (abort_run_with_timeout): The sig parameter is not used.
> Mark it
> as such.
> (abort_run_with_timeout): One implementation of this function did
> not
> diff --git a/src/Makefile.am b/src/Makefile.am
> index a5db9fd..cfa853f 100644
> --- a/src/Makefile.am
> +++ b/src/Makefile.am
> @@ -74,6 +74,7 @@ version.c:  $(wget_SOURCES) ../lib/libgnu.a
> echo '/* version.c */' > $@
> echo '/* Autogenerated by Makefile - DO NOT EDIT */' >> $@
> echo '' >> $@
> +   echo '#include "version.h"' >> $@
> echo 'const char *version_string = "@VERSION@";' >> $@
> echo 'const char *compilation_string = "'$(COMPILE)'";' \
> | $(ESCAPEQUOTE) >> $@
> diff --git a/src/http.c b/src/http.c
> index b96d4a9..bac471d 100644
> --- a/src/http.c
> +++ b/src/http.c
> @@ -60,6 +60,7 @@ as that of the covered work.  */
>  #include "spider.h"
>  #include "warc.h"
>  #include "c-strcase.h"
> +#include "version.h"
>
>  #ifdef TESTING
>  #include "test.h"
> @@ -69,7 +70,6 @@ as that of the covered work.  */
>  # include "vms.h"
>  #endif /* def __VMS */
>
> -extern char *version_string;
>
>  /* Forward decls. */
>  struct http_stat;
> diff --git a/src/main.c b/src/main.c
> index 28c832c..56f3312 100644
> --- a/src/main.c
> +++ b/src/main.c
> @@ -54,6 +54,7 @@ as that of the covered work.  */
>  #include "http.h"   /* for save_cookies */
>  #include "ptimer.h"
>  #include "warc.h"
> +#include "version.h"
>  #include "c-strcase.h"
>  #include 
>  #include 
> @@ -79,10 +80,7 @@ struct iri dummy_iri;
>  struct options opt;
>
>  /* defined in version.c */
> -extern char *version_string;
> -extern char *compilation_string;
>  extern char *system_getrc;
> -extern char *link_string;
>  /* defined in build_info.c */
>  extern const char *compiled_features[];
>  /* Used for --version output in print_version */
> diff --git a/src/version.h b/src/version.h
> new file mode 100644
> index 000..487f73f
> --- /dev/null
> +++ b/src/version.h
> @@ -0,0 +1,33 @@
> +/* Extern declarations for printing version information
> +   Copyright (C) 2013 Free Software Foundation, Inc.
> +
> +This file is part of GNU Wget.
> +
> +GNU Wget is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3 of the License, or
> +(at your option) any later version.
> +
> +GNU Wget is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with Wget.  If not, see .
> +
> +Additional permission under GNU GPL version 3 section 7
> +
> +If you modify this program, or any covered work, by linking or
> +combining it with the OpenSSL project's OpenSSL library (or a
> +modified version of that library), containing parts covered by the
> +terms of the OpenSSL or SSLeay licenses, the Free Software Foundation
> +grants you additional permission to convey the resulting work.
> +Corresponding Source for a non-source form of such a combination
> +shall include the source code for the parts of OpenSSL used as well
> +as that of the covered work.  */
> +
> +/* Extern declarations for strings in version.c */
> +extern const char *version_string;
> +extern const char *compilation_string;
> +extern const char *link_string;
> diff --git a/src/warc.c b/src/warc.c
> index 32675f8..d61093c 100644
> --- a/src/warc.c
> +++ b/src/warc.c
> @@ -34,6 +34,7 @@ as that of the covered work.  */
>  #include "wget.h"
>  #include "hash.h"
>  #include "utils.h"
> +#include "version.h"
>
>  #include 
>  #include 
> @@ -66,7 +67,6 @@ as that of the covered work.  */
>  #define O_TEMPORARY 0
>  #endif
>
>

Re: [Bug-wget] wcat?

2014-11-19 Thread Pär Karlsson
That does seem like a crazy amount of extra work (and code) for just a
simple wrapper. :-) And really, curl would be the way to do this on most
platforms/systems.

I'd be happy to do it, though, if there is enough interest from the people
on the list?

/Pär

2014-11-20 7:34 GMT+01:00 Darshit Shah :

> On 11/19, Pär Karlsson wrote:
>
>> Or, realizing that even the simplest of shell scripts contain bugs, this
>> one is slightly better... and conforms better to GNU coding standards. :-)
>>
>> /Pär
>>
>>  Or realizing that Free Software means legal stuff, we need to modify
> this and add a legal notice about the GPL, a copyright years notice, and
> options to handle --help and --version. Because Wget's standard help cannot
> be used here.
> The script needs to eliminate any -O options passed to zcat.
>
> Also, if such a script is to go into productions systems, it needs more
> sanity checks for, as previously mentioned, other -O options among a couple
> of other things.
>
> I was just looking at the source for zcat. It's technically a one line
> shell script:
> exec gzip -cd "$@"
>
> However, the rest of the cruft adds another 55 lines.
>
> And talking about legalities, I'm hoping you already have signed the
> assignment papers because otherwise that's even more work, before we can
> add this to the source. :-)
>
>
>  2014-11-19 22:50 GMT+01:00 Pär Karlsson :
>>
>>  Or, if there's any real need, may I suggest this simple patch for making
>>> a
>>> wrapper in contrib/wcat? :-9
>>>
>>> Best regards,
>>>
>>> /Pär
>>>
>>> 2014-11-19 22:34 GMT+01:00 William Tracy :
>>>
>>>  I meant to send this to the whole list, sorry.
>>>> -- Forwarded message --
>>>> From: "William Tracy" 
>>>> Date: Nov 19, 2014 1:32 PM
>>>> Subject: Re: [Bug-wget] wcat?
>>>> To: "Dagobert Michelsen" 
>>>> Cc:
>>>>
>>>> Two thoughts:
>>>>
>>>> 1. Curl already behaves this way out of the box, so the path of least
>>>> resistance here would be to just use curl. Curl lacks wget's crawling
>>>> features, but I don't see how you could use those features in this
>>>> context,
>>>> anyway.
>>>>
>>>> 2. If you are dead set on having a wcat command, you might have more
>>>> luck
>>>> getting this implemented by the package maintainer for your distribution
>>>> than by the core wget team.
>>>>
>>>> William
>>>> On Nov 19, 2014 12:41 PM, "Dagobert Michelsen"  wrote:
>>>>
>>>> > Hi Tim,
>>>> >
>>>> > > Am 17.11.2014 um 20:49 schrieb Tim Rühsen :
>>>> > >
>>>> > > Am Montag, 17. November 2014, 09:05:52 schrieb Alfred M. Szmidt:
>>>> > >> It would be nice if wget also installed a wcat command which would
>>>> > >> default to something like,
>>>> > >>
>>>> > >>  wget -o /dev/null -O - "$@"
>>>> > >>
>>>> > >> Would it be possible to add something like that?
>>>> > >
>>>> > > Something like
>>>> > >
>>>> > > $ alias wcat='wget -o /dev/null -O -'
>>>> > > $ wcat www.example.com
>>>> > >
>>>> > > ?
>>>> > >
>>>> > > Put the alias into your ~/.bash_aliases or ~/.bashrc and that's it.
>>>> > >
>>>> > > What have I missed ;-) ?
>>>> >
>>>> > I think Alfred wants to have a behaviour similar to gzip/gzcat
>>>> where
>>>> > you could also have similar structure but nonetheless gzip ships
>>>> gzcat,
>>>> > gzgrep etc.
>>>> >
>>>> >
>>>> > Best regards
>>>> >
>>>> >   — Dago
>>>> >
>>>> > --
>>>> > "You don't become great by trying to be great, you become great by
>>>> wanting
>>>> > to do something,
>>>> > and then doing it so hard that you become great in the process." -
>>>> xkcd
>>>> > #896
>>>> >
>>>> >
>>>>
>>>>
>>>
>>>
>  From 43cb7d02d3f6f4880fc92ee2f2982e21e5e58440 Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?P=C3=A4r=20Karlsson?= 
>> Date: Wed, 19 Nov 2014 23:10:22 +0100
>> Subject: [PATCH] Added wcat convenience wrapper
>>
>> ---
>> contrib/wcat | 10 ++
>> 1 file changed, 10 insertions(+)
>> create mode 100755 contrib/wcat
>>
>> diff --git a/contrib/wcat b/contrib/wcat
>> new file mode 100755
>> index 000..3477d88
>> --- /dev/null
>> +++ b/contrib/wcat
>> @@ -0,0 +1,10 @@
>> +#!/bin/sh
>> +WGET=`which wget 2>/dev/null`
>> +if [ ! -x "${WGET}" ]; then
>> +  echo "No wget executable found in PATH"
>> +  exit 127
>> +fi
>> +
>> +"${WGET}" -q -O - "$@"
>> +
>> +# vim: tabstop=2 shiftwidth=2
>> --
>> 2.0.4
>>
>>
> --- end quoted text ---
>
> --
> Thanking You,
> Darshit Shah
>


Re: [Bug-wget] wcat?

2014-11-19 Thread Pär Karlsson
Or, realizing that even the simplest of shell scripts contain bugs, this
one is slightly better... and conforms better to GNU coding standards. :-)

/Pär

2014-11-19 22:50 GMT+01:00 Pär Karlsson :

> Or, if there's any real need, may I suggest this simple patch for making a
> wrapper in contrib/wcat? :-9
>
> Best regards,
>
> /Pär
>
> 2014-11-19 22:34 GMT+01:00 William Tracy :
>
>> I meant to send this to the whole list, sorry.
>> -- Forwarded message --
>> From: "William Tracy" 
>> Date: Nov 19, 2014 1:32 PM
>> Subject: Re: [Bug-wget] wcat?
>> To: "Dagobert Michelsen" 
>> Cc:
>>
>> Two thoughts:
>>
>> 1. Curl already behaves this way out of the box, so the path of least
>> resistance here would be to just use curl. Curl lacks wget's crawling
>> features, but I don't see how you could use those features in this
>> context,
>> anyway.
>>
>> 2. If you are dead set on having a wcat command, you might have more luck
>> getting this implemented by the package maintainer for your distribution
>> than by the core wget team.
>>
>> William
>> On Nov 19, 2014 12:41 PM, "Dagobert Michelsen"  wrote:
>>
>> > Hi Tim,
>> >
>> > > Am 17.11.2014 um 20:49 schrieb Tim Rühsen :
>> > >
>> > > Am Montag, 17. November 2014, 09:05:52 schrieb Alfred M. Szmidt:
>> > >> It would be nice if wget also installed a wcat command which would
>> > >> default to something like,
>> > >>
>> > >>  wget -o /dev/null -O - "$@"
>> > >>
>> > >> Would it be possible to add something like that?
>> > >
>> > > Something like
>> > >
>> > > $ alias wcat='wget -o /dev/null -O -'
>> > > $ wcat www.example.com
>> > >
>> > > ?
>> > >
>> > > Put the alias into your ~/.bash_aliases or ~/.bashrc and that's it.
>> > >
>> > > What have I missed ;-) ?
>> >
>> > I think Alfred wants to have a behaviour similar to gzip/gzcat where
>> > you could also have similar structure but nonetheless gzip ships gzcat,
>> > gzgrep etc.
>> >
>> >
>> > Best regards
>> >
>> >   — Dago
>> >
>> > --
>> > "You don't become great by trying to be great, you become great by
>> wanting
>> > to do something,
>> > and then doing it so hard that you become great in the process." - xkcd
>> > #896
>> >
>> >
>>
>
>
From 43cb7d02d3f6f4880fc92ee2f2982e21e5e58440 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A4r=20Karlsson?= 
Date: Wed, 19 Nov 2014 23:10:22 +0100
Subject: [PATCH] Added wcat convenience wrapper

---
 contrib/wcat | 10 ++
 1 file changed, 10 insertions(+)
 create mode 100755 contrib/wcat

diff --git a/contrib/wcat b/contrib/wcat
new file mode 100755
index 000..3477d88
--- /dev/null
+++ b/contrib/wcat
@@ -0,0 +1,10 @@
+#!/bin/sh
+WGET=`which wget 2>/dev/null`
+if [ ! -x "${WGET}" ]; then
+  echo "No wget executable found in PATH"
+  exit 127
+fi
+
+"${WGET}" -q -O - "$@"
+
+# vim: tabstop=2 shiftwidth=2
-- 
2.0.4



Re: [Bug-wget] wcat?

2014-11-19 Thread Pär Karlsson
Or, if there's any real need, may I suggest this simple patch for making a
wrapper in contrib/wcat? :-9

Best regards,

/Pär

2014-11-19 22:34 GMT+01:00 William Tracy :

> I meant to send this to the whole list, sorry.
> -- Forwarded message --
> From: "William Tracy" 
> Date: Nov 19, 2014 1:32 PM
> Subject: Re: [Bug-wget] wcat?
> To: "Dagobert Michelsen" 
> Cc:
>
> Two thoughts:
>
> 1. Curl already behaves this way out of the box, so the path of least
> resistance here would be to just use curl. Curl lacks wget's crawling
> features, but I don't see how you could use those features in this context,
> anyway.
>
> 2. If you are dead set on having a wcat command, you might have more luck
> getting this implemented by the package maintainer for your distribution
> than by the core wget team.
>
> William
> On Nov 19, 2014 12:41 PM, "Dagobert Michelsen"  wrote:
>
> > Hi Tim,
> >
> > > Am 17.11.2014 um 20:49 schrieb Tim Rühsen :
> > >
> > > Am Montag, 17. November 2014, 09:05:52 schrieb Alfred M. Szmidt:
> > >> It would be nice if wget also installed a wcat command which would
> > >> default to something like,
> > >>
> > >>  wget -o /dev/null -O - "$@"
> > >>
> > >> Would it be possible to add something like that?
> > >
> > > Something like
> > >
> > > $ alias wcat='wget -o /dev/null -O -'
> > > $ wcat www.example.com
> > >
> > > ?
> > >
> > > Put the alias into your ~/.bash_aliases or ~/.bashrc and that's it.
> > >
> > > What have I missed ;-) ?
> >
> > I think Alfred wants to have a behaviour similar to gzip/gzcat where
> > you could also have similar structure but nonetheless gzip ships gzcat,
> > gzgrep etc.
> >
> >
> > Best regards
> >
> >   — Dago
> >
> > --
> > "You don't become great by trying to be great, you become great by
> wanting
> > to do something,
> > and then doing it so hard that you become great in the process." - xkcd
> > #896
> >
> >
>
From 2c0b9fd4b0ca6aa81467c4cc892af8c5cdf3e0df Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A4r=20Karlsson?= 
Date: Wed, 19 Nov 2014 22:44:50 +0100
Subject: [PATCH] Added wcat wrapper

---
 contrib/wcat | 8 
 1 file changed, 8 insertions(+)
 create mode 100755 contrib/wcat

diff --git a/contrib/wcat b/contrib/wcat
new file mode 100755
index 000..af37200
--- /dev/null
+++ b/contrib/wcat
@@ -0,0 +1,8 @@
+#!/bin/sh
+WGET=`which wget`
+if [ ! -x "${WGET}" ]; then
+echo "No wget executable found in PATH"
+exit -1
+fi
+
+${WGET} -q -O - "$@"
-- 
2.0.4



Re: [Bug-wget] [PATCH] Stylistic and idiomatic cleanups in Perl tests

2014-11-03 Thread Pär Karlsson
The attached patch addresses the problem and fixes 'make distcheck' for me.
Tested under perl-5.20.1 and perl-5.8.9 (all in between should be OK, as
long as the modules below are installed).

It might be worth noting (even though the Perl test suite is slowly being
deprecated(?)) that you need two additional Perl modules in order to make
the test suite run at all. They are usually not, AFAIK, bundled with the
standard Perl on most distributions, or installed by default:

* HTTP::Daemon (dev-perl/HTTP-Daemon on Gentoo, libhttp-daemon-perl on
Debian, etc.)
* IO::Socket::SSL (dev-perl/IO-Socket-SSL on Gentoo, libio-socket-ssl-perl
on Debian, etc.)

In case you don't have those installed, 'make distcheck' will fail; this
might be worth noting in a README or something.
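
For reference, installing them could look something like this (the package
names are the ones mentioned above; the plain cpan call is just a fallback if
your distribution does not package them):

  # Debian/Ubuntu
  apt-get install libhttp-daemon-perl libio-socket-ssl-perl
  # Gentoo
  emerge dev-perl/HTTP-Daemon dev-perl/IO-Socket-SSL
  # or, via CPAN
  cpan HTTP::Daemon IO::Socket::SSL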

Anyway, please review and check. Should solve the problem :-)

/Pär



2014-11-03 12:27 GMT+01:00 Tim Ruehsen :

> On Sunday 02 November 2014 21:31:50 Pär Karlsson wrote:
> > I realized I got rid of one warning only to produce another in the test
> > suite.
> >
> > The issue was the syntax of the eval statement in WgetFeature.pm that
> > essentially does what we want, but that produces a warning about using a
> > string or array in void context (meaning it shouldn't normally do
> > something, except that we are doing it in an eval()).
> >
> > Changed the eval into a proper string eval, which stops the warning.
> >
> > Sorry about the mess-up!
> >
> > Best regards,
> >
> > /Pär
>
> Hi Pär,
>
> I realized that 'make distcheck' now fails. Something is broken with the
> path
> that searches modules on.
>
> It was commit 8078adee7f4be04040f12232be8698236153c1f1 (Stylistic and
> idiomatic cleanups in Perl tests).
>
> Please take a look at it, maybe you can fix it.
>
> Tim
>


0001-Fixed-finding-WgetFeature.cfg-when-in-separate-build.patch
Description: Binary data


Re: [Bug-wget] Backup authentication blog -copia de seguridad de blog con autentificacion.

2014-11-03 Thread Pär Karlsson
Hello,

Try running wget with the -v / --verbose option enabled; this will allow
you to see the details of what is happening.

The reply from the server above ("405 Method Not Allowed") refers to an HTTP
method (probably GET) which is not allowed on a certain resource. It
could be that the blog is built on, for instance, a RESTful design, which
only allows certain HTTP methods (such as POST, PUT, PATCH, etc.) on certain
resources, for instance on a URI meant only for posting new messages.

In this case it would mean that it is not a bug in wget, but a valid error
reply from the server. You will probably not be able to download exactly
everything on the site in that case.

/Pär

2014-11-02 19:51 GMT+01:00 pedro lomas :

>
> I'm trying to download a complete copy of a personal blog that is not
> public and therefore requires authentication blogger, by username and
> password.
>
> I used the command:
>
> -http-user=correoelectron...@gmail.com wget -http-password = password
> --mirror http://nombredelblog.blogspot.com.es/
>
>
>
> but I get this error
>
>
>
> Connecting to www.blogger.com | 74.125.195.191 |: 80 ... connected.
> HTTP request sent, awaiting response ... 302 Moved Temporarily
> Location: https://www.blogger.com/blogin.g?blogspotURL=http://.
> blogspot.com.es/ [following]
> --2014-11-02 19: 42: 17-- https://www.blogger.com/
> blogin.g?blogspotURL=http://.blogspot.com.es/
> Connecting to www.blogger.com | 74.125.195.191 |: 443 ... connected.
> HTTP request sent, awaiting response ... 405 Method Not Allowed
> 02/11/2014 19:42:18 ERROR 405: Method Not Allowed.
>
> Can anyone help me?
>
> 
>
>
>
>
>
>
>
>
>
>
>
>
> Estoy intentando descargar un copia completa de un blog personal que no es
> publico y que por tanto precisa autentificacion en blogger,  mediante
> usuario y contraseña.
>
> He utilizado el comando :
>
> wget --http-user=correoelectron...@gmail.com --http-password=contraseña
> --mirror http://nombredelblog.blogspot.com.es/
>
>
>
> pero me da este error
>
>
>
> Connecting to www.blogger.com|74.125.195.191|:80... conectado.
> Petición HTTP enviada, esperando respuesta... 302 Moved Temporarily
> Localización: https://www.blogger.com/blogin.g?blogspotURL=http://.
> blogspot.com.es/ [siguiendo]
> --2014-11-02 19:42:17--  https://www.blogger.com/
> blogin.g?blogspotURL=http://.blogspot.com.es/
> Connecting to www.blogger.com|74.125.195.191|:443... conectado.
> Petición HTTP enviada, esperando respuesta... 405 Method Not Allowed
> 2014-11-02 19:42:18 ERROR 405: Method Not Allowed.
>
> ¿Puede alguien ayudarme?
>
>
>
>
>
>
>
>
>


Re: [Bug-wget] [PATCH] Stylistic and idiomatic cleanups in Perl tests

2014-11-02 Thread Pär Karlsson
I realized I got rid of one warning only to produce another in the test
suite.

The issue was the syntax of the eval statement in WgetFeature.pm that
essentially does what we want, but that produces a warning about using a
string or array in void context (meaning it shouldn't normally do something,
except that we are doing it in an eval()).

Changed the eval into a proper string eval, which stops the warning.

Sorry about the mess-up!

Best regards,

/Pär

2014-11-01 18:13 GMT+01:00 Darshit Shah :

> On 11/01, Pär Karlsson wrote:
>
>> Sorry, my bad. :-/ Amended in new patch attached.
>>
>
> Pushed. Thanks for the contribution!
>
>
>
> --
> Thanking You,
> Darshit Shah
>
From c189e59f18a4a9d0e2cf8a88bb19d7042031793d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A4r=20Karlsson?= 
Date: Sun, 2 Nov 2014 21:26:53 +0100
Subject: [PATCH] Fixed eval syntax that generated unnecessary warnings in
 WgetFeature.pm

---
 tests/ChangeLog  | 3 +++
 tests/WgetFeature.pm | 6 ++
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tests/ChangeLog b/tests/ChangeLog
index 5097871..e3101f7 100644
--- a/tests/ChangeLog
+++ b/tests/ChangeLog
@@ -1,3 +1,6 @@
+2014-11-02  Pär Karlsson 
+	* WgetFeature.pm: Corrected block eval to string eval
+
 2014-10-31  Pär Karlsson 
 	* WgetTests.pm: Proper conditional operators, tidied up code, idiomatic
 	improvements as per modern Perl best practices.
diff --git a/tests/WgetFeature.pm b/tests/WgetFeature.pm
index 880a238..64e35cf 100644
--- a/tests/WgetFeature.pm
+++ b/tests/WgetFeature.pm
@@ -15,10 +15,8 @@ our %SKIP_MESSAGES;
   or croak "Cannot open 'WgetFeature.cfg': $ERRNO";
 my @lines = <$fh>;
 close $fh or carp "Cannot close 'WgetFeature.cfg': $ERRNO";
-eval {
-@lines;
-1;
-} or carp "Cannot eval 'WgetFeature.cfg': $ERRNO";
+my $evalstring = join q{}, @lines;
+eval { $evalstring } or carp "Cannot eval 'WgetFeature.cfg': $ERRNO";
 }
 
 sub import
-- 
2.0.4



Re: [Bug-wget] [PATCH] Stylistic and idiomatic cleanups in Perl tests

2014-11-01 Thread Pär Karlsson
Sorry, my bad. :-/ Amended in new patch attached.

/Pär

2014-11-01 7:00 GMT+01:00 Darshit Shah :

> On 11/01, Darshit Shah wrote:
>
>> On 10/31, Pär Karlsson wrote:
>>
>>> Thanks for the tips, I was going crazy trying to make git to send the
>>> emails via git send-email, and it just ended up weird.
>>>
>>> Changed the tree into a branch and retained the WgetTests.pm name.
>>>
>>> Attaching via Gmail interface. If anything looks weird, I'll try to fix
>>> it.
>>>
>>> Tested as previously with all versions of Perl.
>>>
>>>  Ack
>>
>>  Just noticed, the ChangeLog entry needs to go into tests/ChangeLog not
> in the ChangeLog file in the root of the repository.
>
>
>> --- end quoted text ---
>>
>> --
>> Thanking You,
>> Darshit Shah
>>
>
>
> --- end quoted text ---
>
> --
> Thanking You,
> Darshit Shah
>


0001-Stylistic-and-idiomatic-cleanups-in-Perl-tests.patch
Description: Binary data


Re: [Bug-wget] [PATCH] Stylistic and idiomatic cleanups in Perl tests

2014-10-31 Thread Pär Karlsson
Thanks for the tips; I was going crazy trying to make git send the
emails via git send-email, and it just ended up weird.

Changed the tree into a branch and retained the WgetTests.pm name.

Attaching via Gmail interface. If anything looks weird, I'll try to fix it.

Tested as previously with all versions of Perl.

Cheers,

/Pär


2014-10-31 21:31 GMT+01:00 Tim Rühsen :

> Am Freitag, 31. Oktober 2014, 18:44:03 schrieb Pär Karlsson:
> > Oh, and the test suite works on all stable versions of perl from 5.6
> > through 5.20.
> >
> > No problems with any of these versions:
> > perl-5.6.2
> > perl-5.8.9
> > perl-5.10.1
> > perl-5.12.5
> > perl-5.14.4
> > perl-5.16.3
> > perl-5.18.4
> > perl-5.20.1
> >
> > And I realize now I probably need some guidance on how to format patches
> > properly for git format-patch/send-email. Sorry about the spam :-/
>
> Hi Pär,
>
> you don't spam, thanks for your contribution !
>
> And It's very good to know there is a perl expert here !
>
> Well, you cloned wget and maybe worked on the 'master' branch (but I guess
> a separate branch for each task is better in the long term).
> After you made your changes and tests, you commit locally with 'git commit
> -a -m "the commit message"'. Now simply 'git format-patch -1' and your
> complete last commit will be saved as an (email) patch file. The filename is
> 0001-
>
> Just attach that file to your email.
>
> Now comes the point where people on this list may ask you to amend your
> changes. Make and commit the changes, then merge these two commits into one
> (git rebase -i; I use 'fix' for the commit that should be merged into the
> 'pick' one; search the web for a detailed description).
> On success, you now restart with 'git format-patch -1'.
>
> Just ask if you face problems.
>
>
> Just one comment.
> Right now it is a good idea to stay with WgetTests.pm. Please don't rename
> it.
> We just moved it from WgetTest.pm to WgetTests.pm (for some technical /
> organisational reason, to avoid developers having problems).
>
>
> Regards, Tim
>
>
> >
> > Best regards,
> >
> > /Pär
> >
> > 2014-10-30 20:54 GMT+01:00 :
> > > From: Pär Karlsson 
> > >
> > > ---
> > >
> > >  tests/ChangeLog  |  12 +
> > >  tests/FTPServer.pm   | 597
> > >
> > > +++
> > >
> > >  tests/FTPTest.pm |  36 +--
> > >  tests/HTTPServer.pm  | 208 +-
> > >  tests/HTTPTest.pm|  28 +-
> > >  tests/Makefile.am|   2 +-
> > >  tests/Test-proxied-https-auth.px |   2 +-
> > >  tests/WgetFeature.pm |  41 ++-
> > >  tests/WgetTest.pm| 423 +++
> > >  tests/WgetTests.pm   | 334 --
> > >  10 files changed, 993 insertions(+), 690 deletions(-)
>


0001-Stylistic-and-idiomatic-cleanups-in-Perl-tests.patch
Description: Binary data


Re: [Bug-wget] [PATCH] Stylistic and idiomatic cleanups in Perl tests

2014-10-31 Thread Pär Karlsson
Oh, and the test suite works on all stable versions of perl from 5.6
through 5.20.

No problems with any of these versions:
perl-5.6.2
perl-5.8.9
perl-5.10.1
perl-5.12.5
perl-5.14.4
perl-5.16.3
perl-5.18.4
perl-5.20.1

And I realize now I probably need some guidance on how to format patches
properly for git format-patch/send-email. Sorry about the spam :-/

Best regards,

/Pär

2014-10-30 20:54 GMT+01:00 :

> From: Pär Karlsson 
>
> ---
>  tests/ChangeLog  |  12 +
>  tests/FTPServer.pm   | 597
> +++
>  tests/FTPTest.pm |  36 +--
>  tests/HTTPServer.pm  | 208 +-
>  tests/HTTPTest.pm|  28 +-
>  tests/Makefile.am|   2 +-
>  tests/Test-proxied-https-auth.px |   2 +-
>  tests/WgetFeature.pm |  41 ++-
>  tests/WgetTest.pm| 423 +++
>  tests/WgetTests.pm   | 334 --
>  10 files changed, 993 insertions(+), 690 deletions(-)
>
> diff --git a/tests/ChangeLog b/tests/ChangeLog
> index 5f37f63..a05d65c 100644
> --- a/tests/ChangeLog
> +++ b/tests/ChangeLog
> @@ -1,3 +1,15 @@
> +2014-10-30  Pär Karlsson 
> +   * WgetTests.pm: Renamed to WgetTest.pm to match package definition
> +   * WgetTest.pm: Proper conditional operators, tidied up code,
> idiomatic
> +   improvements as per modern Perl best practices.
> +   * WgetFeature.pm: Tidied up code, idiomatic improvements for
> readability
> +   * FTPServer.pm: Tidied up code (perltidy -gnu)
> +   * FTPTest.pm: Likewise
> +   * HTTPServer.pm: Likewise
> +   * HTTPTest.pm: Likewise
> +   * Makefile.am: Track name change of WgetTests.pm => WgetTest.pm
> +   * Test-proxied-https-auth.px: Tidied up code
> +
>  2014-10-30  Mike Frysinger 
>
> * WgetFeature.pm: fix skip exit code to 77
> diff --git a/tests/FTPServer.pm b/tests/FTPServer.pm
> index 1603caa..6d8ad72 100644
> --- a/tests/FTPServer.pm
> +++ b/tests/FTPServer.pm
> @@ -19,43 +19,40 @@ my $GOT_SIGURG = 0;
>
>  # connection states
>  my %_connection_states = (
> -'NEWCONN'  => 0x01,
> -'WAIT4PWD' => 0x02,
> -'LOGGEDIN' => 0x04,
> -'TWOSOCKS' => 0x08,
> -);
> +  'NEWCONN'  => 0x01,
> +  'WAIT4PWD' => 0x02,
> +  'LOGGEDIN' => 0x04,
> +  'TWOSOCKS' => 0x08,
> + );
>
>  # subset of FTP commands supported by these server and the respective
>  # connection states in which they are allowed
>  my %_commands = (
> +
>  # Standard commands from RFC 959.
> -'CWD'  => $_connection_states{LOGGEDIN} |
> -  $_connection_states{TWOSOCKS},
> -#   'EPRT' => $_connection_states{LOGGEDIN},
> -#   'EPSV' => $_connection_states{LOGGEDIN},
> +'CWD' => $_connection_states{LOGGEDIN} |
> $_connection_states{TWOSOCKS},
> +
> +#   'EPRT' => $_connection_states{LOGGEDIN},
> +#   'EPSV' => $_connection_states{LOGGEDIN},
>  'LIST' => $_connection_states{TWOSOCKS},
> -#   'LPRT' => $_connection_states{LOGGEDIN},
> -#   'LPSV' => $_connection_states{LOGGEDIN},
> +
> +#   'LPRT' => $_connection_states{LOGGEDIN},
> +#   'LPSV' => $_connection_states{LOGGEDIN},
>  'PASS' => $_connection_states{WAIT4PWD},
>  'PASV' => $_connection_states{LOGGEDIN},
>  'PORT' => $_connection_states{LOGGEDIN},
> -'PWD'  => $_connection_states{LOGGEDIN} |
> -  $_connection_states{TWOSOCKS},
> -'QUIT' => $_connection_states{LOGGEDIN} |
> -  $_connection_states{TWOSOCKS},
> +'PWD'  => $_connection_states{LOGGEDIN} |
> $_connection_states{TWOSOCKS},
> +'QUIT' => $_connection_states{LOGGEDIN} |
> $_connection_states{TWOSOCKS},
>  'REST' => $_connection_states{TWOSOCKS},
>  'RETR' => $_connection_states{TWOSOCKS},
>  'SYST' => $_connection_states{LOGGEDIN},
> -'TYPE' => $_connection_states{LOGGEDIN} |
> -  $_connection_states{TWOSOCKS},
> +'TYPE' => $_connection_states{LOGGEDIN} |
> $_connection_states{TWOSOCKS},
>  'USER' => $_connection_states{NEWCONN},
> +
>  # From ftpexts Internet Draft.
> -'SIZE' => $_connection_states{LOGGEDIN} |

Re: [Bug-wget] [PATCH] Add valgrind testing support via ./configure

2014-10-28 Thread Pär Karlsson
Regarding the Perl test suite, I'd be happy to clean up the code a little
bit, if you think it's worth it?

There are some minor issues with the current code (such as Perl producing
certain warnings regarding invalid operators, e.g. "Argument "" isn't
numeric in numeric eq (==) at WgetTests.pm line 97."), but nothing serious
(and nothing affecting the running of the test suite per se). I'd be happy,
though, to come up with a suggestion for somewhat cleaner but functionally
equivalent code.
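
That particular warning is easy to reproduce in isolation; a one-liner sketch
(the variable name is made up):

  perl -we 'my $str = q{}; print "match\n" if $str == 0;'
  # Argument "" isn't numeric in numeric eq (==) at -e line 1.

The cure is simply to compare strings with 'eq' (or to check definedness
first) instead of using the numeric '==' operator.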

/Pär

2014-10-28 12:34 GMT+01:00 Tim Ruehsen :

> On Monday 27 October 2014 21:59:28 Tim Rühsen wrote:
> > Am Samstag, 25. Oktober 2014, 23:40:41 schrieb Tim Rühsen:
> > > Am Sonntag, 19. Oktober 2014, 17:41:30 schrieb Darshit Shah:
> > > > On 10/09, Tim Rühsen wrote:
> > > > >> Hence, hard coding the command actually reduces the amount of
> work a
> > > > >> user
> > > > >> needs to do in order to run the tests under valgrind.
> > > > >>
> > > > >> My suggestion is that we allow the configure option, but hard code
> > > > >> the
> > > > >> valgrind command into the test suites themselves, and not leave
> them
> > > > >> environment variables.
> > > > >
> > > > >What about removing the configure option and
> > > > >if VALGRIND_TESTS is undefined or empty or "0": normal testing
> > > > >if VALGRIND_TESTS is "1": valgrind testing with hard-coded options
> > > > >else: testing with command given in VALGRIND_TESTS
> > > > >
> > > > >The above described workflow should still work, right ?
> > > > >And we could do the complete test suite with
> > > > >VALGRIND_TESTS="1" make check
> > > > >
> > > > >What do you think ?
> > > >
> > > > --- end quoted text ---
> > > >
> > > > I actually like this idea.
> > > >
> > > > case $VALGRIND_TESTS:
> > > > "", 0) Normal tests;;
> > > > 1) Hard coded valgrind string;;
> > > > *) Execute the provided command;;
> > > >
> > > > Could you please make the relevant changes to atleast the Perl based
> > > > tests
> > > > and to configure.ac? I'm currently traveling, but I'll fix the
> Python
> > > > tests
> > > > ASAP and send in a patch which will work well with the aforementioned
> > > > cases.
> > >
> > > Hi Darhsit,
> > >
> > > here is the valgrind patch. Both Perl and Python test suites amended.
> > >
> > Added the ChangeLog entries.
>
> I pushed it.
>
> Tim
>


Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-27 Thread Pär Karlsson
No complaints from me; it applied cleanly, all tests OK. :-)

2014-10-27 21:53 GMT+01:00 Tim Rühsen :

> Am Samstag, 25. Oktober 2014, 11:09:48 schrieb Pär Karlsson:
> > Good point. I got carried away by the criticism against xrealloc being
> too
> > expensive, etc, but you are of course right. Your implementation seems to
> > me the most readable, and in this context, it's what matters. :-)
> >
> > Thanks for the tip re: callgrind :-)
> >
> > /Pär
> >
> > 2014-10-24 20:39 GMT+02:00 Tim Rühsen :
> > > Pär, thanks for your work.
> > >
> > > *BUT* speed does not really matter here. My intention was to show code
> > > that is
> > > much more readable than the current implementation which is unnecessary
> > > complicated and not even understandable by the LLVM/clang analyzer.
> > > That piece of code should be replaced.
> > >
> > > FYI, if you want to work on CPU cycle optimization, use valgrind --
> > > tool=callgrind for your wget command to be polished. The resulting file
> > > can be
> > > viewed with e.g. kcachegrind.
> > >
> > > Tim
>
> i made up a patch.
>
> Any complaints ?
>
> Tim
>


Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-25 Thread Pär Karlsson
Good point. I got carried away by the criticism against xrealloc being too
expensive, etc., but you are of course right. Your implementation seems to
me the most readable, and in this context, that's what matters. :-)

Thanks for the tip re: callgrind :-)

/Pär

2014-10-24 20:39 GMT+02:00 Tim Rühsen :

> Pär, thanks for your work.
>
> *BUT* speed does not really matter here. My intention was to show code
> that is
> much more readable than the current implementation which is unnecessary
> complicated and not even understandable by the LLVM/clang analyzer.
> That piece of code should be replaced.
>
> FYI, if you want to work on CPU cycle optimization, use valgrind --
> tool=callgrind for your wget command to be polished. The resulting file
> can be
> viewed with e.g. kcachegrind.
>
> Tim
>
>
> Am Freitag, 24. Oktober 2014, 19:27:43 schrieb Pär Karlsson:
> > Well, I wrote a little benchmark and implemented them all in a hastily
> > thrown together little program.
> >
> > Basically, it tests three different strings against all three
> > implementations of the functions (but it uses malloc instead of xmalloc,
> > because I didn't want to throw in the whole of gnulib there too):
> >
> > The results from 1 million iterations on each string (on an AMD
> Athlon(tm)
> > Dual Core Processor 4850e) are as follows:
> >
> > feinorgh@elektra ~/code $ make strtest
> > cc strtest.c   -o strtest
> > feinorgh@elektra ~/code $ ./strtest
> > Result concat_strings: avg 0.0018 s, total 0.18253400 s
> > Result concat_strings_new: avg 0.0025 s, total 0.25143400 s
> > Result concat_strings_tim: avg 0.0036 s, total 0.36166900 s
> > Result concat_strings: avg 0.0015 s, total 0.15230500 s
> > Result concat_strings_new: avg 0.0022 s, total 0.22280100 s
> > Result concat_strings_tim: avg 0.0027 s, total 0.27337900 s
> > Result concat_strings: avg 0.0073 s, total 0.72672500 s
> > Result concat_strings_new: avg 0.0066 s, total 0.66113200 s
> > Result concat_strings_tim: avg 0.0148 s, total 1.47995200 s
> > feinorgh@elektra ~/code $ gcc -O3 -march=native -o strtest strtest.c
> -Wall
> > feinorgh@elektra ~/code $ ./strtest
> > Result concat_strings: avg 0.0013 s, total 0.12796300 s
> > Result concat_strings_new: avg 0.0020 s, total 0.20472700 s
> > Result concat_strings_tim: avg 0.0016 s, total 0.16481100 s
> > Result concat_strings: avg 0.0011 s, total 0.10742400 s
> > Result concat_strings_new: avg 0.0018 s, total 0.18417900 s
> > Result concat_strings_tim: avg 0.0013 s, total 0.12504500 s
> > Result concat_strings: avg 0.0059 s, total 0.58840200 s
> > Result concat_strings_new: avg 0.0054 s, total 0.53843800 s
> > Result concat_strings_tim: avg 0.0068 s, total 0.68416400 s
> > feinorgh@elektra ~/code $ gcc -g -march=native -o strtest strtest.c
> -Wall
> > feinorgh@elektra ~/code $ valgrind ./strtest
> > ==3802== Memcheck, a memory error detector
> > ==3802== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
> > ==3802== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright
> info
> > ==3802== Command: ./strtest
> > ==3802==
> > Result concat_strings: avg 0.0754 s, total 7.54378500 s
> > Result concat_strings_new: avg 0.1080 s, total 10.80041900 s
> > Result concat_strings_tim: avg 0.0900 s, total 8.99982200 s
> > Result concat_strings: avg 0.0652 s, total 6.52148200 s
> > Result concat_strings_new: avg 0.0985 s, total 9.85281200 s
> > Result concat_strings_tim: avg 0.0770 s, total 7.69649600 s
> > Result concat_strings: avg 0.2109 s, total 21.08910100 s
> > Result concat_strings_new: avg 0.2570 s, total 25.69685100 s
> > Result concat_strings_tim: avg 0.2541 s, total 25.40761200 s
> > ==3802==
> > ==3802== HEAP SUMMARY:
> > ==3802== in use at exit: 0 bytes in 0 blocks
> > ==3802==   total heap usage: 13,000,000 allocs, 13,000,000 frees,
> > 743,000,000 bytes allocated
> > ==3802==
> > ==3802== All heap blocks were freed -- no leaks are possible
> > ==3802==
> > ==3802== For counts of detected and suppressed errors, rerun with: -v
> > ==3802== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
> >
> > All three implementations perform correctly as far as I can tell (they
> give
> > identical results), and behave well regarding memory management (as long
> as
> > the allocated space is free():d afterwards).
> >
> > I tested it with gperf too, and it kind of indicated the same results,
> the

Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-24 Thread Pär Karlsson
Well, I wrote a little benchmark and implemented them all in a hastily
thrown together little program.

Basically, it tests three different strings against all three
implementations of the functions (but it uses malloc instead of xmalloc,
because I didn't want to throw in the whole of gnulib there too):

The results from 1 million iterations on each string (on an AMD Athlon(tm)
Dual Core Processor 4850e) are as follows:

feinorgh@elektra ~/code $ make strtest
cc strtest.c   -o strtest
feinorgh@elektra ~/code $ ./strtest
Result concat_strings: avg 0.0018 s, total 0.18253400 s
Result concat_strings_new: avg 0.0025 s, total 0.25143400 s
Result concat_strings_tim: avg 0.0036 s, total 0.36166900 s
Result concat_strings: avg 0.0015 s, total 0.15230500 s
Result concat_strings_new: avg 0.0022 s, total 0.22280100 s
Result concat_strings_tim: avg 0.0027 s, total 0.27337900 s
Result concat_strings: avg 0.0073 s, total 0.72672500 s
Result concat_strings_new: avg 0.0066 s, total 0.66113200 s
Result concat_strings_tim: avg 0.0148 s, total 1.47995200 s
feinorgh@elektra ~/code $ gcc -O3 -march=native -o strtest strtest.c -Wall
feinorgh@elektra ~/code $ ./strtest
Result concat_strings: avg 0.0013 s, total 0.12796300 s
Result concat_strings_new: avg 0.0020 s, total 0.20472700 s
Result concat_strings_tim: avg 0.0016 s, total 0.16481100 s
Result concat_strings: avg 0.0011 s, total 0.10742400 s
Result concat_strings_new: avg 0.0018 s, total 0.18417900 s
Result concat_strings_tim: avg 0.0013 s, total 0.12504500 s
Result concat_strings: avg 0.0059 s, total 0.58840200 s
Result concat_strings_new: avg 0.0054 s, total 0.53843800 s
Result concat_strings_tim: avg 0.0068 s, total 0.68416400 s
feinorgh@elektra ~/code $ gcc -g -march=native -o strtest strtest.c -Wall
feinorgh@elektra ~/code $ valgrind ./strtest
==3802== Memcheck, a memory error detector
==3802== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==3802== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==3802== Command: ./strtest
==3802==
Result concat_strings: avg 0.0754 s, total 7.54378500 s
Result concat_strings_new: avg 0.1080 s, total 10.80041900 s
Result concat_strings_tim: avg 0.0900 s, total 8.99982200 s
Result concat_strings: avg 0.0652 s, total 6.52148200 s
Result concat_strings_new: avg 0.0985 s, total 9.85281200 s
Result concat_strings_tim: avg 0.0770 s, total 7.69649600 s
Result concat_strings: avg 0.2109 s, total 21.08910100 s
Result concat_strings_new: avg 0.2570 s, total 25.69685100 s
Result concat_strings_tim: avg 0.2541 s, total 25.40761200 s
==3802==
==3802== HEAP SUMMARY:
==3802== in use at exit: 0 bytes in 0 blocks
==3802==   total heap usage: 13,000,000 allocs, 13,000,000 frees,
743,000,000 bytes allocated
==3802==
==3802== All heap blocks were freed -- no leaks are possible
==3802==
==3802== For counts of detected and suppressed errors, rerun with: -v
==3802== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)

All three implementations perform correctly as far as I can tell (they give
identical results), and behave well regarding memory management (as long as
the allocated space is free()'d afterwards).

I tested it with gperf too, and it kind of indicated the same result: the
original implementation is the fastest.

I'm almost ashamed to attach the benchmark program, since it's so clumsily
(and copy-pastily) put together, but for completeness, I attach it anyway
(strtest.c.gz).
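
(For completeness in the archive: the attached strtest.c is not reproduced
here, but a minimal sketch of the kind of timing harness described above
looks roughly like this; the candidate function and the test strings are
placeholders, not the real variants from the attachment:)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Placeholder candidate; in the real harness this would be one of the
   concat_strings variants under test.  */
static char *
concat2 (const char *a, const char *b)
{
  size_t la = strlen (a), lb = strlen (b);
  char *ret = malloc (la + lb + 1);
  if (!ret)
    abort ();
  memcpy (ret, a, la);
  memcpy (ret + la, b, lb);
  ret[la + lb] = '\0';
  return ret;
}

int
main (void)
{
  const long iterations = 1000000;
  clock_t start = clock ();
  for (long i = 0; i < iterations; i++)
    {
      char *s = concat2 ("foo", "/bar/baz.html");
      free (s);             /* freeing each result keeps valgrind's leak check clean */
    }
  double total = (double) (clock () - start) / CLOCKS_PER_SEC;
  printf ("Result concat2: avg %.4f s per 10k calls, total %.8f s\n",
          total / (iterations / 10000.0), total);
  return 0;
}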

My conclusion is: the current implementation is the best and most
efficient; it does everything it's supposed to do, and it does it quicker
(at least on my machine) than either of the given alternatives. For 5
arguments or fewer it is the most efficient, and nowhere in the current
codebase is concat_strings called with more than 5 strings... to my
knowledge :-) There's really nothing wrong with it, IMO :-)
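
(Purely for the record, a stand-alone sketch of the bound-checked
length-cache pattern under discussion, reconstructed from the diff context
in this thread rather than copied verbatim from src/utils.c; every element
of saved_lengths that is read has been written first, which is why the
analyzer warning looks like a false positive:)

#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

#define countof(array) (sizeof (array) / sizeof ((array)[0]))

static char *
concat_sketch (const char *str0, ...)
{
  va_list args;
  size_t saved_lengths[5];      /* lengths of the first few args are cached */
  size_t argcount = 0, total_length = 0;
  const char *next_str;
  char *ret, *p;

  /* First pass: measure, caching up to countof (saved_lengths) lengths. */
  va_start (args, str0);
  for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
    {
      size_t len = strlen (next_str);
      if (argcount < countof (saved_lengths))
        saved_lengths[argcount++] = len;
      total_length += len;
    }
  va_end (args);

  p = ret = malloc (total_length + 1);
  if (!ret)
    abort ();

  /* Second pass: copy, recomputing lengths only for args beyond the cache. */
  argcount = 0;
  va_start (args, str0);
  for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
    {
      size_t len = (argcount < countof (saved_lengths))
                   ? saved_lengths[argcount++]
                   : strlen (next_str);
      memcpy (p, next_str, len);
      p += len;
    }
  va_end (args);
  *p = '\0';
  return ret;
}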

Best regards,

/Pär

2014-10-23 22:01 GMT+02:00 Tim Rühsen :

> Am Dienstag, 21. Oktober 2014, 16:49:12 schrieb Yousong Zhou:
> > On 21 October 2014 16:17, Pär Karlsson  wrote:
> > > Yes, you are right, of course. Looking through the original
> implementation
> > > again, it seems water tight. clang probably complains about the
> > > uninitialized values above argcount in saved_lengths[], that are never
> > > reached.
> > >
> > > The precalculated strlen:s saved is likely only an optimization(?)
> > > attempt,
> > > I suppose.
> >
> > Yes. Grepping through the code shows that currently there is no
> > invocation of concat_strings() having more than 5 arguments.
> >
> > > Still, it seems wasteful to set up two complete loops with va_arg, and
> > > considering what this function actually does, I w

Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-21 Thread Pär Karlsson
Thanks!

It didn't even occur to me to check this out. My only excuse is gratuitous
consistency and lack of pure C experience; a malloc() without a
corresponding sizeof() seemed a little arbitrary to me, but it makes sense
now :-)
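
(A tiny stand-alone illustration of the point, nothing to do with the wget
sources:)

#include <assert.h>
#include <stdlib.h>

int
main (void)
{
  /* sizeof (char) is 1 by definition in the C standard, so both
     allocations request exactly 128 bytes.  */
  assert (sizeof (char) == 1);
  char *a = malloc (128);
  char *b = malloc (128 * sizeof (char));   /* same size, just noisier */
  free (a);
  free (b);
  return 0;
}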

/Pär

2014-10-21 17:46 GMT+02:00 Micah Cowan :

> On Mon, Oct 20, 2014 at 7:02 PM, Yousong Zhou 
> wrote:
> > I am not sure here.  Do we always assume sizeof(char) to be 1 for
> > platforms supported by wget?
>
> FWIW, sizeof(char) is always 1 by definition; the C standard
> guarantees it. Even on systems with no addressable values smaller than
> 16 bits, because on such systems, C defines a "byte" to be 16 bits
> (recall that, originally at least, a byte isn't necessarily an octet,
> and that's the meaning the C standards use).
>
> If a platform doesn't have sizeof(char) == 1, it's not the C language. :)
>
> -mjc
>


Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-21 Thread Pär Karlsson
Yes, you are right, of course. Looking through the original implementation
again, it seems watertight. clang probably complains about the
uninitialized values above argcount in saved_lengths[], which are never
reached.

The precalculated strlen values that are saved are likely only an
optimization(?) attempt, I suppose.

Still, it seems wasteful to set up two complete loops with va_arg, and
considering what this function actually does, I wonder whether s(n)printf
shouldn't be used instead of this function? :-)
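
(Sketching that thought purely as an illustration, not a proposal for
utils.c: snprintf can only replace the variadic function when the number of
arguments is known, but then a measure-then-write pair of calls does both
passes for you; concat3_snprintf is a made-up name for this example:)

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical fixed-argument variant: the first snprintf call with a
   NULL buffer only measures, the second actually writes.  */
static char *
concat3_snprintf (const char *a, const char *b, const char *c)
{
  int needed = snprintf (NULL, 0, "%s%s%s", a, b, c);
  if (needed < 0)
    return NULL;
  char *ret = malloc ((size_t) needed + 1);
  if (!ret)
    return NULL;
  snprintf (ret, (size_t) needed + 1, "%s%s%s", a, b, c);
  return ret;
}

int
main (void)
{
  char *s = concat3_snprintf ("foo", "/", "bar");
  if (s)
    {
      puts (s);   /* prints "foo/bar" */
      free (s);
    }
  return 0;
}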

Best regards,

/Pär

2014-10-21 4:22 GMT+02:00 Yousong Zhou :

> Hi, Pär
>
> On 17 October 2014 03:50, Pär Karlsson  wrote:
> > Hi, I fould a potential gotcha when playing with clang's code analysis
> tool.
> >
> > The concat_strings function silently stopped counting string lengths when
> > given more than 5 arguments. clang warned about potential garbage values
> in
> > the saved_lengths array, so I redid it with this approach.
>
> After taking a closer look, I guess the old implementation is fine.
> saved_length[] is used as a buffer for lengths of the first 5
> arguments and there is a bound check with its length.  Maybe it's a
> false-positive from clang tool?
>
> Sorry for the noise...
>
> Regards.
>
> yousong
>
> >
> > All tests working ok with this patch.
> >
> > This is my first patch to this list, by the way. I'd be happy to help out
> > more in the future.
> >
> > Best regards,
> >
> > /Pär Karlsson, Sweden
> >
> > 
> >
> > commit 2d855670e0e1fbe578506b376cdd40b0e465d3ef
> > Author: Pär Karlsson 
> > Date:   Thu Oct 16 21:41:36 2014 +0200
> >
> > Updated ChangeLog
> >
> > diff --git a/src/ChangeLog b/src/ChangeLog
> > index 1c4e2d5..1e39475 100644
> > --- a/src/ChangeLog
> > +++ b/src/ChangeLog
> > @@ -1,3 +1,8 @@
> > +2014-10-16  Pär Karlsson  
> > +
> > +   * utils.c (concat_strings): fixed arbitrary limit of 5 arguments
> to
> > +   function
> > +
> >  2014-05-03  Tim Ruehsen  
> >
> > * retr.c (retrieve_url): fixed memory leak
> >
> > commit 1fa9ff274dcb6e5a2dbbbc7d3fe2f139059c47f1
> > Author: Pär Karlsson 
> > Date:   Wed Oct 15 00:00:31 2014 +0200
> >
> > Generalized concat_strings argument length
> >
> >   The concat_strings function seemed arbitrary to only accept a
> maximum
> >   of 5 arguments (the others were silently ignored).
> >
> >   Also it had a potential garbage read of the values in the array.
> >   Updated with xmalloc/xrealloc/free
> >
> > diff --git a/src/utils.c b/src/utils.c
> > index 78c282e..93c9ddc 100644
> > --- a/src/utils.c
> > +++ b/src/utils.c
> > @@ -356,7 +356,8 @@ char *
> >  concat_strings (const char *str0, ...)
> >  {
> >va_list args;
> > -  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
> > +  size_t psize = sizeof(int);
> > +  int *saved_lengths = xmalloc (psize);
> >char *ret, *p;
> >
> >const char *next_str;
> > @@ -370,8 +371,8 @@ concat_strings (const char *str0, ...)
> >for (next_str = str0; next_str != NULL; next_str = va_arg (args, char
> *))
> >  {
> >int len = strlen (next_str);
> > -  if (argcount < countof (saved_lengths))
> > -saved_lengths[argcount++] = len;
> > +  saved_lengths[argcount++] = len;
> > +  xrealloc(saved_lengths, psize * argcount);
> >total_length += len;
> >  }
> >va_end (args);
> > @@ -393,7 +394,7 @@ concat_strings (const char *str0, ...)
> >  }
> >va_end (args);
> >*p = '\0';
> > -
> > +  free(saved_lengths);
> >return ret;
> >  }
> >  ^L
>


Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-20 Thread Pär Karlsson
Whoops, I realised I failed to follow the GNU coding standards; please
disregard the last one. The patch below should be better.

My apologies :-/

/Pär

diff --git a/src/ChangeLog b/src/ChangeLog
index d5aeca0..87abd85 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,8 @@
+2014-10-20 Pär Karlsson  
+
+   * utils.c (concat_strings): got rid of double loop, cleaned up
potential
+   memory corruption if concat_strings was called with more than five
args
+
 2014-10-16  Tim Ruehsen  

* url.c (url_parse): little code cleanup
diff --git a/src/utils.c b/src/utils.c
index 78c282e..5f359e0 100644
--- a/src/utils.c
+++ b/src/utils.c
@@ -356,42 +356,36 @@ char *
 concat_strings (const char *str0, ...)
 {
   va_list args;
-  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
   char *ret, *p;

   const char *next_str;
-  int total_length = 0;
-  size_t argcount;
+  size_t len;
+  size_t total_length = 0;
+  size_t charsize = sizeof (char);
+  size_t chunksize = 64;
+  size_t bufsize = 64;
+
+  p = ret = xmalloc (charsize * bufsize);

   /* Calculate the length of and allocate the resulting string. */

-  argcount = 0;
   va_start (args, str0);
   for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
 {
-  int len = strlen (next_str);
-  if (argcount < countof (saved_lengths))
-saved_lengths[argcount++] = len;
+  len = strlen (next_str);
+  if (len == 0)
+continue;
   total_length += len;
-}
-  va_end (args);
-  p = ret = xmalloc (total_length + 1);
-
-  /* Copy the strings into the allocated space. */
-
-  argcount = 0;
-  va_start (args, str0);
-  for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
-{
-  int len;
-  if (argcount < countof (saved_lengths))
-len = saved_lengths[argcount++];
-  else
-len = strlen (next_str);
+  if (total_length > bufsize)
+  {
+bufsize += chunksize;
+ret = xrealloc (ret, charsize * bufsize);
+  }
   memcpy (p, next_str, len);
   p += len;
 }
   va_end (args);
+  ret = xrealloc (ret, charsize * total_length + 1);
   *p = '\0';

   return ret;

2014-10-20 22:14 GMT+02:00 Pär Karlsson :

> Thank you for your feedback and suggestions.
>
> I thought about this during the weekend and figured it could be done much
> more efficiently, by only looping through the arguments once. Also, I
> realized the the original version would have segfaulted if called with more
> than 5 args, since the destination memory never got allocated before the
> memcpy (in the current code, it never is, though!).
>
> I cleaned up the code according to the GNU coding standards as well. The
> test suite rolls with this, and I think it looks better (although the
> function is only really called in a handful of places in all of wget).
>
> Best regards,
>
> /Pär
>
> Patch below:
>
> diff --git a/src/ChangeLog b/src/ChangeLog
> index d5aeca0..87abd85 100644
> --- a/src/ChangeLog
> +++ b/src/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-10-20 Pär Karlsson  
> +
> +   * utils.c (concat_strings): got rid of double loop, cleaned up
> potential
> +   memory corruption if concat_strings was called with more than five
> args
> +
>  2014-10-16  Tim Ruehsen  
>
> * url.c (url_parse): little code cleanup
> diff --git a/src/utils.c b/src/utils.c
> index 78c282e..dbeb9fe 100644
> --- a/src/utils.c
> +++ b/src/utils.c
> @@ -356,42 +356,36 @@ char *
>  concat_strings (const char *str0, ...)
>  {
>va_list args;
> -  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
>char *ret, *p;
>
>const char *next_str;
> -  int total_length = 0;
> -  size_t argcount;
> +  size_t len;
> +  size_t total_length = 0;
> +  size_t charsize = sizeof (char);
> +  size_t chunksize = 64;
> +  size_t bufsize = 64;
> +
> +  p = ret = xmalloc (charsize * bufsize);
>
>/* Calculate the length of and allocate the resulting string. */
>
> -  argcount = 0;
>va_start (args, str0);
>for (next_str = str0; next_str != NULL; next_str = va_arg (args, char
> *))
>  {
> -  int len = strlen (next_str);
> -  if (argcount < countof (saved_lengths))
> -saved_lengths[argcount++] = len;
> +  len = strlen (next_str);
> +  if (len == 0) {
> +  continue;
> +  }
>total_length += len;
> -}
> -  va_end (args);
> -  p = ret = xmalloc (total_length + 1);
> -
> -  /* Copy the strings into the allocated space. */
> -
> -  argcount = 0;
> -  va_start (args, str0);
> -  for (next_str = str0; next_str != NULL; next_str = va_arg (args, char
> *))
> -{
> -  int len;
> -  if (argcount < countof (saved_length

Re: [Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-20 Thread Pär Karlsson
Thank you for your feedback and suggestions.

I thought about this during the weekend and figured it could be done much
more efficiently, by only looping through the arguments once. Also, I
realized that the original version would have segfaulted if called with more
than 5 args, since the destination memory never got allocated before the
memcpy (in the current code, it never is, though!).

I cleaned up the code according to the GNU coding standards as well. The
test suite rolls with this, and I think it looks better (although the
function is only really called in a handful of places in all of wget).

Best regards,

/Pär

Patch below:

diff --git a/src/ChangeLog b/src/ChangeLog
index d5aeca0..87abd85 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,8 @@
+2014-10-20 Pär Karlsson  
+
+   * utils.c (concat_strings): got rid of double loop, cleaned up
potential
+   memory corruption if concat_strings was called with more than five
args
+
 2014-10-16  Tim Ruehsen  

* url.c (url_parse): little code cleanup
diff --git a/src/utils.c b/src/utils.c
index 78c282e..dbeb9fe 100644
--- a/src/utils.c
+++ b/src/utils.c
@@ -356,42 +356,36 @@ char *
 concat_strings (const char *str0, ...)
 {
   va_list args;
-  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
   char *ret, *p;

   const char *next_str;
-  int total_length = 0;
-  size_t argcount;
+  size_t len;
+  size_t total_length = 0;
+  size_t charsize = sizeof (char);
+  size_t chunksize = 64;
+  size_t bufsize = 64;
+
+  p = ret = xmalloc (charsize * bufsize);

   /* Calculate the length of and allocate the resulting string. */

-  argcount = 0;
   va_start (args, str0);
   for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
 {
-  int len = strlen (next_str);
-  if (argcount < countof (saved_lengths))
-saved_lengths[argcount++] = len;
+  len = strlen (next_str);
+  if (len == 0) {
+  continue;
+  }
   total_length += len;
-}
-  va_end (args);
-  p = ret = xmalloc (total_length + 1);
-
-  /* Copy the strings into the allocated space. */
-
-  argcount = 0;
-  va_start (args, str0);
-  for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
-{
-  int len;
-  if (argcount < countof (saved_lengths))
-len = saved_lengths[argcount++];
-  else
-len = strlen (next_str);
+  if (total_length > bufsize) {
+  bufsize += chunksize;
+  ret = xrealloc (ret, charsize * bufsize);
+  }
   memcpy (p, next_str, len);
   p += len;
 }
   va_end (args);
+  ret = xrealloc (ret, charsize * total_length + 1);
   *p = '\0';

   return ret;



2014-10-19 16:16 GMT+02:00 Giuseppe Scrivano :

> Pär Karlsson  writes:
>
> > Hi, I fould a potential gotcha when playing with clang's code analysis
> tool.
> >
> > The concat_strings function silently stopped counting string lengths when
> > given more than 5 arguments. clang warned about potential garbage values
> in
> > the saved_lengths array, so I redid it with this approach.
> >
> > All tests working ok with this patch.
>
> thanks for your contribution.  I've just few comments:
>
>
> > commit 2d855670e0e1fbe578506b376cdd40b0e465d3ef
> > Author: Pär Karlsson 
> > Date:   Thu Oct 16 21:41:36 2014 +0200
> >
> > Updated ChangeLog
> >
>
> we usually update the changelog in the same commit that made the change,
> so please squash these two commits into one.
>
> Also, it doesn't apply on current git master, as it seems to be based on
> a old version of git from the ChangeLog file context, could you rebase
> onto the master branch?
>
>
> > diff --git a/src/utils.c b/src/utils.c
> > index 78c282e..93c9ddc 100644
> > --- a/src/utils.c
> > +++ b/src/utils.c
> > @@ -356,7 +356,8 @@ char *
> >  concat_strings (const char *str0, ...)
> >  {
> >va_list args;
> > -  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
> > +  size_t psize = sizeof(int);
>
> please leave a space between sizeof and '(' as mandated by the GNU
> coding standards.
>
> > +  int *saved_lengths = xmalloc (psize);
> >char *ret, *p;
> >
> >const char *next_str;
> > @@ -370,8 +371,8 @@ concat_strings (const char *str0, ...)
> >for (next_str = str0; next_str != NULL; next_str = va_arg (args, char
> *))
> >  {
> >int len = strlen (next_str);
> > -  if (argcount < countof (saved_lengths))
> > -saved_lengths[argcount++] = len;
> > +  saved_lengths[argcount++] = len;
> > +  xrealloc(saved_lengths, psize * argcount);
>
> same here.
>
> >total_length += len;
> >  }
> >va_end (args);
> > @@ -393,7 +394,7 @@ concat_strings (const char *str0, ...)
> >  }
> >va_end (args);
> >*p = '\0';
> > -
> > +  free(saved_lengths);
>
> and here.
>
> Regards,
> Giuseppe
>


[Bug-wget] [PATCH] Small fix for limited number of strings (and potential garbage value) in arguments to concat_strings

2014-10-16 Thread Pär Karlsson
Hi, I found a potential gotcha when playing with clang's code analysis tool.

The concat_strings function silently stopped counting string lengths when
given more than 5 arguments. clang warned about potential garbage values in
the saved_lengths array, so I redid it with this approach.

All tests working ok with this patch.

This is my first patch to this list, by the way. I'd be happy to help out
more in the future.

Best regards,

/Pär Karlsson, Sweden



commit 2d855670e0e1fbe578506b376cdd40b0e465d3ef
Author: Pär Karlsson 
Date:   Thu Oct 16 21:41:36 2014 +0200

Updated ChangeLog

diff --git a/src/ChangeLog b/src/ChangeLog
index 1c4e2d5..1e39475 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,8 @@
+2014-10-16  Pär Karlsson  
+
+   * utils.c (concat_strings): fixed arbitrary limit of 5 arguments to
+   function
+
 2014-05-03  Tim Ruehsen  

* retr.c (retrieve_url): fixed memory leak

commit 1fa9ff274dcb6e5a2dbbbc7d3fe2f139059c47f1
Author: Pär Karlsson 
Date:   Wed Oct 15 00:00:31 2014 +0200

Generalized concat_strings argument length

  The concat_strings function seemed arbitrary to only accept a maximum
  of 5 arguments (the others were silently ignored).

  Also it had a potential garbage read of the values in the array.
  Updated with xmalloc/xrealloc/free

diff --git a/src/utils.c b/src/utils.c
index 78c282e..93c9ddc 100644
--- a/src/utils.c
+++ b/src/utils.c
@@ -356,7 +356,8 @@ char *
 concat_strings (const char *str0, ...)
 {
   va_list args;
-  int saved_lengths[5]; /* inspired by Apache's apr_pstrcat */
+  size_t psize = sizeof(int);
+  int *saved_lengths = xmalloc (psize);
   char *ret, *p;

   const char *next_str;
@@ -370,8 +371,8 @@ concat_strings (const char *str0, ...)
   for (next_str = str0; next_str != NULL; next_str = va_arg (args, char *))
 {
   int len = strlen (next_str);
-  if (argcount < countof (saved_lengths))
-saved_lengths[argcount++] = len;
+  saved_lengths[argcount++] = len;
+  xrealloc(saved_lengths, psize * argcount);
   total_length += len;
 }
   va_end (args);
@@ -393,7 +394,7 @@ concat_strings (const char *str0, ...)
 }
   va_end (args);
   *p = '\0';
-
+  free(saved_lengths);
   return ret;
 }
 ^L