Re: [Bug-wget] Wget build system.

2012-04-10 Thread illusionoflife
On Tuesday, April 10, 2012 10:48:23 PM you wrote:
> On 04/10/2012 10:34 PM, illusionoflife wrote:
> > Yes, you are right: I missed that perl module. 68/69 now.
> > One stupid question: are these tests meant to be run by a user
> > building from source, or by a developer?
> 
> Well, the more people running them, the better, but the main purpose for
> them was for the developers to assure themselves that, in the process of
> adding features and fixing bugs, they didn't break a bunch of other
> stuff. :)
> 
> -mjc

Okay. I am still confused by the sheer number of
testing systems. The Perl way is not my way :)
If I would like to add tests (I really want to move the hash tests out of /src),
may I use check, dejagnu or autotest instead of these Perl scripts?
Check and dejagnu are extra dependencies, although LFS builds them in its
first few steps.
Is there any document, other than the GNU Coding Standards, that explains
what I may and may not do, so that I neither waste my work nor
disturb experienced developers?
-- 
Best regards,
illusionoflife



Re: [Bug-wget] Wget build system.

2012-04-10 Thread Micah Cowan
On 04/10/2012 10:34 PM, illusionoflife wrote:
> Yes, you are right: I missed that perl module. 68/69 now.
> One stupid question: are these tests meant to be run by a user
> building from source, or by a developer?

Well, the more people running them, the better, but the main purpose for
them was for the developers to assure themselves that, in the process of
adding features and fixing bugs, they didn't break a bunch of other
stuff. :)

-mjc



Re: [Bug-wget] Wget build system.

2012-04-10 Thread illusionoflife
On Tuesday, April 10, 2012 01:53:21 PM you wrote:
> On 05/11/2012 12:10 PM, illusionoflife wrote:
> > On Monday, April 09, 2012 09:50:00 PM Ángel González wrote:
> >> Do you have perl installed?
> > 
> > $perl --version
> > This is perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-
> > thread-multi
> 
> In addition to Perl, I believe there are a couple non-standard modules
> that are needed. Check the "use" lines in a couple tests for clues (they
> may also be mentioned in a README or some such at top-level?).
> 
> For instance, HTTP::Daemon is used, which on Debian-ish systems at least
> is provided by a package, libhttp-daemon-perl. I believe others are
> needed for FTP, and probably for SSL support.
> 
> -mjc

Yes, you are right: I missed that perl module. 68/69 now.
One stupid question: are these tests meant to be run by a user
building from source, or by a developer?
-- 
Best regards,
illusionoflife



Re: [Bug-wget] Regular expression matching

2012-04-10 Thread Gijs van Tulder

Hi,

Here is a new version of the regular expressions patch. The new version 
combines POSIX (always, from gnulib) and PCRE (if available).


The patch adds these options:

 --accept-regex="..."
 --reject-regex="..."

 --regex-type=posix   for POSIX extended regexes (the default)
 --regex-type=pcre    for PCRE regexes (if PCRE is available)

In reference to the --match-query-string patch: since the regexes look 
at the complete URL, you can also use them to match the query string.
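
To illustrate matching against the complete URL, here is a minimal sketch using the POSIX regex API (regcomp/regexec with REG_EXTENDED). It is not the code from the patch: the names url_matches and should_download are hypothetical, and the rule that a URL must match the accept pattern and must not match the reject pattern is an assumption about the intended semantics.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the patch: returns true when the
   POSIX extended regex PATTERN matches anywhere in URL.  */
bool
url_matches (const char *pattern, const char *url)
{
  regex_t re;
  bool matched;

  assert (pattern != NULL && url != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;               /* an invalid pattern counts as "no match" */
  matched = regexec (&re, url, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}

/* Hypothetical decision function (assumed semantics): a URL is wanted when
   it matches ACCEPT_RE, if given, and does not match REJECT_RE, if given.
   NULL means the option was not set.  */
bool
should_download (const char *url, const char *accept_re, const char *reject_re)
{
  if (accept_re && !url_matches (accept_re, url))
    return false;
  if (reject_re && url_matches (reject_re, url))
    return false;
  return true;
}
```

Because the whole URL is handed to regexec, a pattern such as "\?id=[0-9]+$" can select on the query string, while "\.(png|jpg)$" used as a reject pattern would skip images.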


Regards,

Gijs
=== modified file 'ChangeLog'
--- ChangeLog	2012-03-25 11:47:53 +
+++ ChangeLog	2012-04-10 22:28:11 +
@@ -1,3 +1,8 @@
+2012-04-11  Gijs van Tulder  
+
+	* bootstrap.conf (gnulib_modules): Include module `regex'.
+	* configure.ac: Check for PCRE library.
+
 2012-03-25 Ray Satiro 
 
 	* configure.ac: Fix build under mingw when OpenSSL is used.

=== modified file 'bootstrap.conf'
--- bootstrap.conf	2012-03-20 19:41:14 +
+++ bootstrap.conf	2012-04-04 15:09:08 +
@@ -58,6 +58,7 @@
 quote
 quotearg
 recv
+regex
 select
 send
 setsockopt

=== modified file 'configure.ac'
--- configure.ac	2012-03-25 11:47:53 +
+++ configure.ac	2012-04-10 21:59:48 +
@@ -532,6 +532,18 @@
   ])
 )
 
+dnl
+dnl Check for PCRE
+dnl
+
+AC_CHECK_HEADER(pcre.h,
+AC_CHECK_LIB(pcre, pcre_compile,
+  [LIBS="${LIBS} -lpcre"
+   AC_DEFINE([HAVE_LIBPCRE], 1,
+ [Define if libpcre is available.])
+  ])
+)
+
  
 dnl Needed by src/Makefile.am
 AM_CONDITIONAL([IRI_IS_ENABLED], [test "X$iri" != "Xno"])

=== modified file 'src/ChangeLog'
--- src/ChangeLog	2012-04-01 14:30:59 +
+++ src/ChangeLog	2012-04-10 22:30:28 +
@@ -1,3 +1,12 @@
+2012-04-11  Gijs van Tulder  
+
+	* init.c: Add --accept-regex, --reject-regex and --regex-type.
+	* main.c: Likewise.
+	* options.c: Likewise.
+	* recur.c: Likewise.
+	* utils.c: Add regex-related functions.
+	* utils.h: Add regex-related functions.
+
 2012-04-01  Giuseppe Scrivano  
 
 	* gnutls.c (wgnutls_read_timeout): Ensure timer is freed.

=== modified file 'src/init.c'
--- src/init.c	2012-03-08 09:00:51 +
+++ src/init.c	2012-04-10 22:10:10 +
@@ -46,6 +46,10 @@
 # endif
 #endif
 
+#include <regex.h>
+#ifdef HAVE_LIBPCRE
+# include <pcre.h>
+#endif
 
 #ifdef HAVE_PWD_H
 # include <pwd.h>
@@ -94,6 +98,7 @@
 CMD_DECLARE (cmd_spec_prefer_family);
 CMD_DECLARE (cmd_spec_progress);
 CMD_DECLARE (cmd_spec_recursive);
+CMD_DECLARE (cmd_spec_regex_type);
 CMD_DECLARE (cmd_spec_restrict_file_names);
 #ifdef HAVE_SSL
 CMD_DECLARE (cmd_spec_secure_protocol);
@@ -116,6 +121,7 @@
 } commands[] = {
   /* KEEP THIS LIST ALPHABETICALLY SORTED */
   { "accept",   &opt.accepts,   cmd_vector },
+  { "acceptregex",  &opt.acceptregex_s, cmd_string },
   { "addhostdir",   &opt.add_hostdir,   cmd_boolean },
   { "adjustextension",  &opt.adjust_extension,  cmd_boolean },
   { "alwaysrest",   &opt.always_rest,   cmd_boolean }, /* deprecated */
@@ -236,7 +242,9 @@
   { "reclevel", &opt.reclevel,  cmd_number_inf },
   { "recursive",NULL,   cmd_spec_recursive },
   { "referer",  &opt.referer,   cmd_string },
+  { "regextype",&opt.regex_type,cmd_spec_regex_type },
   { "reject",   &opt.rejects,   cmd_vector },
+  { "rejectregex",  &opt.rejectregex_s, cmd_string },
   { "relativeonly", &opt.relative_only, cmd_boolean },
   { "remoteencoding",   &opt.encoding_remote,   cmd_string },
   { "removelisting",&opt.remove_listing,cmd_boolean },
@@ -361,6 +369,8 @@
   opt.restrict_files_nonascii = false;
   opt.restrict_files_case = restrict_no_case_restriction;
 
+  opt.regex_type = regex_type_posix;
+
   opt.max_redirect = 20;
 
   opt.waitretry = 10;
@@ -1368,6 +1378,25 @@
   return true;
 }
 
+/* Validate --regex-type and set the choice.  */
+
+static bool
+cmd_spec_regex_type (const char *com, const char *val, void *place_ignored)
+{
+  static const struct decode_item choices[] = {
+    { "posix", regex_type_posix },
+#ifdef HAVE_LIBPCRE
+    { "pcre",  regex_type_pcre },
+#endif
+  };
+  int regex_type = regex_type_posix;
+  int ok = decode_string (val, choices, countof (choices), &regex_type);
+  if (!ok)
+    fprintf (stderr, _("%s: %s: Invalid value %s.\n"), exec_name, com, quote (val));
+  opt.regex_type = regex_type;
+  return ok;
+}
+
 static bool
 cmd_spec_restrict_file_names (const char *com, const char *val, void *place_ignored)
 {

=== modified file 'src/main.c'
--- src/main.c	2012-03-05 21:23:06 +
+++ src/main.c	2012-04-10 22:25:56 +
@@ -158,6 +158,7 @@
 static struct cmdline_option option_data[] =
   {
 { "accept", 'A', OPT_VALUE, "accept", -1 },
+{ "accept-regex", 0, OPT_VALUE, "acceptregex", -1 },
 { "adjust-extension", 'E', OPT_BOOLEAN, "adjustextension", -1 },
 { "append-output", 'a', OPT__APPEND_OUTPUT, NULL, required_argument },
 { "ask-passwo

Re: [Bug-wget] Concurrency and wget

2012-04-10 Thread Micah Cowan
On 04/10/2012 08:52 AM, Tim Ruehsen wrote:
> Meanwhile, I wrote a simple proof of concept (parallel dummy downloads using 
> threads, dummy downloading of chunks, etc.).
> I am at the point where I want to implement HTTP-Header metalink (RFC 6249).
> I just can't find any servers to test with... maybe you can help me out ?
> 
> Well, since there is no response to my previous post: is there any interest 
> in 
> getting that done anyway ?

There's interest, sure enough. But this concurrency stuff was meant to
be a Google Summer of Code project, so someone already getting started
on (and completing a proof-of-concept for) these things leaves us in a
bit of a weird place with regard to the current Summer of Code
applicants we're sifting through.

But perhaps you can post what you've done so far, and we can take a look
at what there is, and what remains, and whether a summer-of-code student
could adapt their needed work to fill in the gaps...

As to HTTP header stuff... I had a hunch before that no one is using it
much in practice, especially since it's a newer spec. But I'd imagine
metalinker.org might; or if not, someone there could probably point you
at a test server somewhere, or something.

-mjc



Re: [Bug-wget] Wget build system.

2012-04-10 Thread Micah Cowan
On 05/11/2012 12:10 PM, illusionoflife wrote:
> On Monday, April 09, 2012 09:50:00 PM Ángel González wrote:
>> Do you have perl installed?
> $perl --version
> This is perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-
> thread-multi

In addition to Perl, I believe there are a couple non-standard modules
that are needed. Check the "use" lines in a couple tests for clues (they
may also be mentioned in a README or some such at top-level?).

For instance, HTTP::Daemon is used, which on Debian-ish systems at least
is provided by a package, libhttp-daemon-perl. I believe others are
needed for FTP, and probably for SSL support.

-mjc



Re: [Bug-wget] Wget build system.

2012-04-10 Thread Ángel González
On 11/05/12 21:10, illusionoflife wrote:
> On Monday, April 09, 2012 09:50:00 PM Ángel González wrote:
>> 9 is the number of run-unit-tests, so it seems
>> all the run-px-tests are failing for you.
>> Do you have perl installed?
> $perl --version
> This is perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-
> thread-multi
> Arch GNU/Linux, if it matters.
> I probably missed something before: after rebootstrapping from bzr, the
> build in a separate directory works, but make check again gives 9/69.
> There is no "make test" target.
>
> Also, is there a manual for all these .px files? Obviously they are tests
> using some Perl framework, but I am still confused about
> how they relate to dejagnu, autotest and check. And what is the canonical
> GNU-way testing tool?
I have the same version, it's very odd.
Does
 tests/run-px . 
work?


> P.S. I will fix the time on my PC.
ntpdate pool.ntp.org is your friend.




Re: [Bug-wget] Wget build system.

2012-04-10 Thread illusionoflife
On Monday, April 09, 2012 09:50:00 PM Ángel González wrote:
> On 10/05/12 20:54, illusionoflife wrote:
> > Hello!
> > Currently I am trying to get used to the
> > wget source base, and I found something strange.
> 
> Your mail arrived on the list the first time, no need to resend it.
> I'm no expert in the wget build system, though.
> Still, there's a whole month before this arrives from the future, so no
> reason to hurry :)
> 
> > When I build from a directory other than $(top_srcdir),
> > 
> >  I get a configure error:
> > config.status: linking ../GNUmakefile to GNUmakefile
> > config.status: error: ../GNUmakefile: file not found
> > When I build from $(top_srcdir), the build is successful, but check
> > 
> > returns this:
> >  ==
> > 
> > 69 tests were run
> > 9 PASS, 0 FAIL
> > 60 SKIP, 0 UNKNOWN
> > ==
> > 
> >  Only 9 tests run?! I find it strange.
> 
> make check, or make test in the test folder, gives me:
> > =
> > 76 tests were run
> > 72 PASS, 3 FAIL
> > 1 SKIP, 0 UNKNOWN
> > =
> 
> 9 is the number of run-unit-tests, so it seems
> all the run-px-tests are failing for you.
> Do you have perl installed?
$perl --version
This is perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-
thread-multi
Arch GNU/Linux, if it matters.
I probably missed something before: after rebootstrapping from bzr, the
build in a separate directory works, but make check again gives 9/69.
There is no "make test" target.

Also, is there a manual for all these .px files? Obviously they are tests
using some Perl framework, but I am still confused about
how they relate to dejagnu, autotest and check. And what is the canonical
GNU-way testing tool?

P.S. I will fix the time on my PC.

-- 
Best regards,
illusionoflife



Re: [Bug-wget] Concurrency and wget

2012-04-10 Thread Tim Ruehsen
Meanwhile, I wrote a simple proof of concept (parallel dummy downloads using 
threads, dummy downloading of chunks, etc.).
I am at the point where I want to implement HTTP-Header metalink (RFC 6249).
I just can't find any servers to test with... maybe you can help me out ?

Well, since there is no response to my previous post: is there any interest in 
getting that done anyway ?

Tim

Am Tuesday 03 April 2012 schrieb Tim Ruehsen:
> Hi Giuseppe, hi Micah,
> 
> while I couldn't sleep last night, I thought about wget and concurrency...
> 
> I had the idea of using a top-down approach to outline what wget is doing.
> Just to have an overview without struggling with the details of
> implementation. As a side effect one would have a (textual? graphical?)
> starting point for contributors to rush into the project. A chance to have
> a clear and well documented design.
> 
> Since maintenance of a flowchart is time-consuming and requires some extra
> skills and tools, pure texts in the form of a "programming language" seems
> to fit.
> 
> Here is just a beginning, let's say a basis for discussions.
> If you don't mind, I would like to take part in ongoing development.
> 
> Basic wget functionality (download given URI/IRI):
> 
> main (URI) {
>   put  into 
> 
>   while  is not empty {
>   download_and_analyse(next  entry)
>   }
> }
> 
> download_and_analyse (URI) {
>   download URI to FILE
>   add URI to 
>   remove URI from 
>   scan FILE and add URIs to  if not already in 
> }
> 
> 
> Extended for simple multitasking (threaded, multi processes or even
> distributed).
> This is just one possible design for concurrent downloads.
> Maybe you have a more elegant idea.
> 
> main (URI) {
>   create  downloaders
>   put  into 
> 
>   wait for status message from downloader {
>   print status
>   if  is empty {
>   stop downloaders
>   we are done
>   }
>   }
> }
> 
> downloader {
>   wait for and allocate entry in  {
>   download_and_analyse(entry)
>   }
> }
> 
> download_and_analyse (URI) {
>   download URI to FILE
>   add URI to 
>   remove URI from 
>   scan FILE and add URIs to  if not already in 
> }
> 
> 
> Extended to download a URI from several sources in parallel.
> main and downloader stay the same, just download_and_analyse() is extended.
> 
> download_and_analyse (URI) {
>   /* download URI to FILE */
>   put  chunk entries into 
>   create  chunkloaders
>   wait for status message from chunkloader {
>   send modified status message to main
>   if  is empty {
>   stop chunk_loaders
>   end loop
>   }
>   }
> 
>   add URI to 
>   remove URI from 
>   scan FILE and add URIs to  if not already in 
> }
> 
> chunk_loader {
>   wait for and allocate entry in  {
>   download(entry)
>   remove entry from 
>   }
> }
> 
> After some iterations we should come to a point where we can make further
> decisions:
> - how to implement concurrency (threads, processes, distributed process,
> (cloud))
> - how to implement communication between tasks
> - is a wget rewrite reasonable ?
> - which existing code to recycle ?
> - creating libraries from existing code (e.g. libwget) or use external
> libraries
>   (e.g. for network stuff, parsing and creating URI/IRIs, etc.)
> - create a list of test code, especially for the library code
> - ... etc etc ...
> 
> 
> Tim
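
The basic (non-concurrent) loop from the outline above can be sketched in C as follows. This is only an illustration under stated assumptions: the archive has stripped the bracketed list names from the outline, so "queue" and "done" are assumed names, and the real download-and-scan step is replaced by a hypothetical in-memory link table (fake_site). None of the identifiers come from wget.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_URIS 32

/* A fixed-size list of URI strings; stands in for both the pending queue
   and the list of finished URIs in the outline (names are assumptions).  */
struct uri_list
{
  const char *items[MAX_URIS];
  size_t count;
};

int
list_contains (const struct uri_list *l, const char *uri)
{
  for (size_t i = 0; i < l->count; i++)
    if (strcmp (l->items[i], uri) == 0)
      return 1;
  return 0;
}

void
list_add (struct uri_list *l, const char *uri)
{
  assert (l->count < MAX_URIS);
  l->items[l->count++] = uri;
}

/* Hypothetical link table replacing the real "download and scan" step.  */
struct page
{
  const char *uri;
  const char *links[3];
};

static const struct page fake_site[] = {
  { "/",  { "/a", "/b", NULL } },
  { "/a", { "/b", "/",  NULL } },
  { "/b", { NULL } },
};

/* Runs the basic loop and returns how many URIs were "downloaded";
   DONE receives them in download order.  */
size_t
crawl (const char *start, struct uri_list *done)
{
  struct uri_list queue = { { NULL }, 0 };
  size_t head = 0;              /* index of the next queue entry to process */

  done->count = 0;
  list_add (&queue, start);
  while (head < queue.count)
    {
      const char *uri = queue.items[head++];    /* take entry off the queue */
      list_add (done, uri);                     /* record it as finished */
      for (size_t p = 0; p < sizeof fake_site / sizeof fake_site[0]; p++)
        {
          if (strcmp (fake_site[p].uri, uri) != 0)
            continue;
          for (size_t k = 0; k < 3 && fake_site[p].links[k]; k++)
            {
              const char *link = fake_site[p].links[k];
              /* The queue array keeps processed entries too, so a single
                 lookup covers both "already queued" and "already done".  */
              if (!list_contains (&queue, link))
                list_add (&queue, link);
            }
        }
    }
  return done->count;
}
```

Calling crawl ("/", &done) on this table visits "/", "/a" and "/b" once each. In the threaded variants of the outline, the queue would additionally need locking or message passing, which this sketch leaves out.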



[Bug-wget] [PATCH] use empty query for filename generation

2012-04-10 Thread Tim Ruehsen
On March 29th, Alejandro Supu reported an issue on the list ("Bug on latest 
wget (1.3.14)").

I could reproduce his problem with the trunk version.
In url.c / url_file_name() an empty query is not used for the filename 
generation.

I wrote a patch which I forgot to put on the list. Here it comes ;-)

Tim
=== modified file 'src/ChangeLog'
--- src/ChangeLog	2012-03-25 15:49:55 +
+++ src/ChangeLog	2012-03-30 09:18:54 +
@@ -1,3 +1,7 @@
+2012-03-30  Tim Ruehsen  
+
+	* url.c: use empty query in local filenames
+
 2012-03-25  Giuseppe Scrivano  

 	* utils.c: Include .

=== modified file 'src/url.c'
--- src/url.c	2011-01-01 12:19:37 +
+++ src/url.c	2012-03-30 09:14:56 +
@@ -1502,7 +1502,7 @@
 {
   struct growable fnres;/* stands for "file name result" */

-  const char *u_file, *u_query;
+  const char *u_file;
   char *fname, *unique;
   char *index_filename = "index.html"; /* The default index file is index.html */

@@ -1561,12 +1561,11 @@
   u_file = *u->file ? u->file : index_filename;
   append_uri_pathel (u_file, u_file + strlen (u_file), false, &fnres);

-  /* Append "?query" to the file name. */
-  u_query = u->query && *u->query ? u->query : NULL;
-  if (u_query)
+  /* Append "?query" to the file name, even if empty */
+  if (u->query)
 	{
 	  append_char (FN_QUERY_SEP, &fnres);
-	  append_uri_pathel (u_query, u_query + strlen (u_query),
+	  append_uri_pathel (u->query, u->query + strlen (u->query),
 			 true, &fnres);
 	}
 }
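
The effect of the change can be shown with a small standalone sketch. It does not use wget's internals: build_local_name is a hypothetical stand-in for the relevant part of url_file_name(), it assumes the query separator is '?', and it skips the escaping that append_uri_pathel() performs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the patched logic: writes FILE (or
   "index.html" when FILE is empty), then "?" plus QUERY whenever QUERY is
   non-NULL, even if QUERY is the empty string.  Returns what snprintf
   returns.  No escaping is done here.  */
int
build_local_name (char *buf, size_t size, const char *file, const char *query)
{
  assert (buf != NULL && file != NULL);
  const char *name = *file ? file : "index.html";

  if (query)
    return snprintf (buf, size, "%s?%s", name, query);
  return snprintf (buf, size, "%s", name);
}
```

With this rule, a URL ending in "page.php?" (empty query) and one ending in "page.php" (no query) map to the distinct local names "page.php?" and "page.php"; before the patch both would have produced "page.php".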



Re: [Bug-wget] patches for TLS SNI support and --match-query-string option

2012-04-10 Thread Giuseppe Scrivano
Noël Köthe  writes:

> Hello,
>
> there are two patches in the wget bug tracker which are requested by
> some users.
> Are there any problems with them or can they be included in trunk?

The first one seems OK; I will look more carefully at the second patch
in the next few days.  I am planning a release in the coming weeks, and it
would be nice to include them too.

Cheers,
Giuseppe