Re: [Bug-wget] Wget build system.
On Tuesday, April 10, 2012 10:48:23 PM you wrote:
> On 04/10/2012 10:34 PM, illusionoflife wrote:
> > Yes, you are right: I missed that perl module. 68/69 now.
> > One stupid question: are these tests meant to be run by users
> > building from source, or by developers?
>
> Well, the more people running them, the better, but the main purpose for
> them was for the developers to assure themselves that, in the process of
> adding features and fixing bugs, they didn't break a bunch of other
> stuff. :)
>
> -mjc

Okay. I am still confused by this great number of testing systems. The Perl way is not my way)

If I would like to add tests (I really want to move the hash tests out of /src), may I use Check, DejaGnu or Autotest instead of these Perl scripts? Check and DejaGnu are extra dependencies, although LFS builds them in the first few steps)

Is there any document, other than the GNU coding standards, that would help me understand what I can and should not do, so that I don't waste my work or disturb experienced developers?

--
Best regards, illusionoflife
Re: [Bug-wget] Wget build system.
On 04/10/2012 10:34 PM, illusionoflife wrote:
> Yes, you are right: I missed that perl module. 68/69 now.
> One stupid question: are these tests meant to be run by users
> building from source, or by developers?

Well, the more people running them, the better, but the main purpose for them was for the developers to assure themselves that, in the process of adding features and fixing bugs, they didn't break a bunch of other stuff. :)

-mjc
Re: [Bug-wget] Wget build system.
On Tuesday, April 10, 2012 01:53:21 PM you wrote:
> On 05/11/2012 12:10 PM, illusionoflife wrote:
> > On Monday, April 09, 2012 09:50:00 PM Ángel González wrote:
> >> Do you have perl installed?
> >
> > $ perl --version
> > This is perl 5, version 14, subversion 2 (v5.14.2) built for
> > x86_64-linux-thread-multi
>
> In addition to Perl, I believe there are a couple of non-standard modules
> that are needed. Check the "use" lines in a couple of tests for clues (they
> may also be mentioned in a README or some such at top level?).
>
> For instance, HTTP::Daemon is used, which on Debian-ish systems at least
> is provided by a package, libhttp-daemon-perl. I believe others are
> needed for FTP, and probably for SSL support.
>
> -mjc

Yes, you are right: I missed that perl module. 68/69 now.
One stupid question: are these tests meant to be run by users building from source, or by developers?

--
Best regards, illusionoflife
Re: [Bug-wget] Regular expression matching
Hi,

Here is a new version of the regular expressions patch. The new version combines POSIX (always, from gnulib) and PCRE (if available).

The patch adds these options:
  --accept-regex="..."
  --reject-regex="..."
  --regex-type=posix  for POSIX extended regexes (the default)
  --regex-type=pcre   for PCRE regexes (if PCRE is available)

In reference to the --match-query-string patch: since the regexes look at the complete URL, you can also use them to match the query string.

Regards,
Gijs

=== modified file 'ChangeLog'
--- ChangeLog 2012-03-25 11:47:53 +0000
+++ ChangeLog 2012-04-10 22:28:11 +0000
@@ -1,3 +1,8 @@
+2012-04-11  Gijs van Tulder
+
+	* bootstrap.conf (gnulib_modules): Include module `regex'.
+	* configure.ac: Check for PCRE library.
+
 2012-03-25  Ray Satiro
 
 	* configure.ac: Fix build under mingw when OpenSSL is used.

=== modified file 'bootstrap.conf'
--- bootstrap.conf 2012-03-20 19:41:14 +0000
+++ bootstrap.conf 2012-04-04 15:09:08 +0000
@@ -58,6 +58,7 @@
 quote
 quotearg
 recv
+regex
 select
 send
 setsockopt

=== modified file 'configure.ac'
--- configure.ac 2012-03-25 11:47:53 +0000
+++ configure.ac 2012-04-10 21:59:48 +0000
@@ -532,6 +532,18 @@
   ])
 )
 
+dnl
+dnl Check for PCRE
+dnl
+
+AC_CHECK_HEADER(pcre.h,
+  AC_CHECK_LIB(pcre, pcre_compile,
+    [LIBS="${LIBS} -lpcre"
+     AC_DEFINE([HAVE_LIBPCRE], 1,
+               [Define if libpcre is available.])
+    ])
+)
+
 dnl Needed by src/Makefile.am
 AM_CONDITIONAL([IRI_IS_ENABLED], [test "X$iri" != "Xno"])

=== modified file 'src/ChangeLog'
--- src/ChangeLog 2012-04-01 14:30:59 +0000
+++ src/ChangeLog 2012-04-10 22:30:28 +0000
@@ -1,3 +1,12 @@
+2012-04-11  Gijs van Tulder
+
+	* init.c: Add --accept-regex, --reject-regex and --regex-type.
+	* main.c: Likewise.
+	* options.c: Likewise.
+	* recur.c: Likewise.
+	* utils.c: Add regex-related functions.
+	* utils.h: Add regex-related functions.
+
 2012-04-01  Giuseppe Scrivano
 
 	* gnutls.c (wgnutls_read_timeout): Ensure timer is freed.

=== modified file 'src/init.c'
--- src/init.c 2012-03-08 09:00:51 +0000
+++ src/init.c 2012-04-10 22:10:10 +0000
@@ -46,6 +46,10 @@
 # endif
 #endif
 
+#include <regex.h>
+#ifdef HAVE_LIBPCRE
+# include <pcre.h>
+#endif
 
 #ifdef HAVE_PWD_H
 # include <pwd.h>
@@ -94,6 +98,7 @@
 CMD_DECLARE (cmd_spec_prefer_family);
 CMD_DECLARE (cmd_spec_progress);
 CMD_DECLARE (cmd_spec_recursive);
+CMD_DECLARE (cmd_spec_regex_type);
 CMD_DECLARE (cmd_spec_restrict_file_names);
 #ifdef HAVE_SSL
 CMD_DECLARE (cmd_spec_secure_protocol);
@@ -116,6 +121,7 @@
 } commands[] = {
   /* KEEP THIS LIST ALPHABETICALLY SORTED */
   { "accept",           &opt.accepts,           cmd_vector },
+  { "acceptregex",      &opt.acceptregex_s,     cmd_string },
   { "addhostdir",       &opt.add_hostdir,       cmd_boolean },
   { "adjustextension",  &opt.adjust_extension,  cmd_boolean },
   { "alwaysrest",       &opt.always_rest,       cmd_boolean }, /* deprecated */
@@ -236,7 +242,9 @@
   { "reclevel",         &opt.reclevel,          cmd_number_inf },
   { "recursive",        NULL,                   cmd_spec_recursive },
   { "referer",          &opt.referer,           cmd_string },
+  { "regextype",        &opt.regex_type,        cmd_spec_regex_type },
   { "reject",           &opt.rejects,           cmd_vector },
+  { "rejectregex",      &opt.rejectregex_s,     cmd_string },
   { "relativeonly",     &opt.relative_only,     cmd_boolean },
   { "remoteencoding",   &opt.encoding_remote,   cmd_string },
   { "removelisting",    &opt.remove_listing,    cmd_boolean },
@@ -361,6 +369,8 @@
   opt.restrict_files_nonascii = false;
   opt.restrict_files_case = restrict_no_case_restriction;
 
+  opt.regex_type = regex_type_posix;
+
   opt.max_redirect = 20;
 
   opt.waitretry = 10;
@@ -1368,6 +1378,25 @@
   return true;
 }
 
+/* Validate --regex-type and set the choice.  */
+
+static bool
+cmd_spec_regex_type (const char *com, const char *val, void *place_ignored)
+{
+  static const struct decode_item choices[] = {
+    { "posix", regex_type_posix },
+#ifdef HAVE_LIBPCRE
+    { "pcre", regex_type_pcre },
+#endif
+  };
+  int regex_type = regex_type_posix;
+  int ok = decode_string (val, choices, countof (choices), &regex_type);
+  if (!ok)
+    fprintf (stderr, _("%s: %s: Invalid value %s.\n"), exec_name, com, quote (val));
+  opt.regex_type = regex_type;
+  return ok;
+}
+
 static bool
 cmd_spec_restrict_file_names (const char *com, const char *val, void *place_ignored)
 {

=== modified file 'src/main.c'
--- src/main.c 2012-03-05 21:23:06 +0000
+++ src/main.c 2012-04-10 22:25:56 +0000
@@ -158,6 +158,7 @@
 static struct cmdline_option option_data[] =
   {
     { "accept", 'A', OPT_VALUE, "accept", -1 },
+    { "accept-regex", 0, OPT_VALUE, "acceptregex", -1 },
     { "adjust-extension", 'E', OPT_BOOLEAN, "adjustextension", -1 },
     { "append-output", 'a', OPT__APPEND_OUTPUT, NULL, required_argument },
    { "ask-passwo
Re: [Bug-wget] Concurrency and wget
On 04/10/2012 08:52 AM, Tim Ruehsen wrote:
> Meanwhile, I wrote a simple proof of concept (parallel dummy downloads
> using threads, dummy downloading of chunks, etc.).
> I am at the point where I want to implement HTTP-header metalink (RFC 6249).
> I just can't find any servers to test with... maybe you can help me out?
>
> Well, since there is no response to my previous post: is there any
> interest in getting that done anyway?

There's interest, sure enough. But this concurrency stuff was meant to be a Google Summer of Code project, so someone already getting started on (and completing a proof of concept for) these things leaves us in a bit of a weird place with regard to the current Summer of Code applicants we're sifting through.

But perhaps you can post what you've done so far, and we can take a look at what there is, what remains, and whether a Summer of Code student could adapt their needed work to fill in the gaps...

As to the HTTP header stuff... I had a hunch before that no one is using it much in practice, especially since it's a newer spec. But I'd imagine metalinker.org might; or if not, someone there could probably point you at a test server somewhere, or something.

-mjc
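For context, RFC 6249 (Metalink/HTTP) conveys mirror and checksum information in ordinary response headers rather than in a separate file. A response might look roughly like this (the hosts, priorities and the digest value are placeholders, not taken from any real server):

```
HTTP/1.1 200 OK
Link: <http://mirror1.example.com/file.iso>; rel=duplicate; pri=1
Link: <http://mirror2.example.com/file.iso>; rel=duplicate; pri=2
Link: <http://example.com/file.iso.meta4>; rel=describedby;
      type="application/metalink4+xml"
Digest: SHA-256=BASE64_OF_HASH_PLACEHOLDER=
```

A client that understands these headers can fetch chunks of file.iso from the rel=duplicate mirrors in parallel and verify the result against the Digest header, which is exactly the hard-to-find-a-test-server feature discussed above.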
Re: [Bug-wget] Wget build system.
On 05/11/2012 12:10 PM, illusionoflife wrote:
> On Monday, April 09, 2012 09:50:00 PM Ángel González wrote:
>> Do you have perl installed?
> $ perl --version
> This is perl 5, version 14, subversion 2 (v5.14.2) built for
> x86_64-linux-thread-multi

In addition to Perl, I believe there are a couple of non-standard modules that are needed. Check the "use" lines in a couple of tests for clues (they may also be mentioned in a README or some such at top level?).

For instance, HTTP::Daemon is used, which on Debian-ish systems at least is provided by a package, libhttp-daemon-perl. I believe others are needed for FTP, and probably for SSL support.

-mjc
Re: [Bug-wget] Wget build system.
On 11/05/12 21:10, illusionoflife wrote:
> On Monday, April 09, 2012 09:50:00 PM Ángel González wrote:
>> 9 is the number of run-unit-tests, so it seems
>> all the run-px tests are failing for you.
>> Do you have perl installed?
> $ perl --version
> This is perl 5, version 14, subversion 2 (v5.14.2) built for
> x86_64-linux-thread-multi
> Arch GNU/Linux, if it matters.
> Probably I missed something before; after re-bootstrapping from bzr, I got
> it built in a separate directory, but again 9/69 on make check.
> There is no "make test" target.
>
> Also, is there a manual for all these .px files? Obviously they are tests
> using some Perl framework, but I am still confused about how they relate
> to DejaGnu, Autotest and Check. And what is the canonical, GNU-way
> testing tool?

I have the same version, it's very odd.
Does tests/run-px . work?

> PS: I will fix the time on my PC.
ntpdate pool.ntp.org is your friend.
Re: [Bug-wget] Wget build system.
On Monday, April 09, 2012 09:50:00 PM Ángel González wrote:
> On 10/05/12 20:54, illusionoflife wrote:
> > Hello! Currently I am trying to get used to
> > the wget source base and found something strange.
>
> Your mail arrived at the list the first time, no need to resend it.
> I'm no expert in the wget build system, though.
> Still, there's a whole month before this arrives from the future, so no
> reason to hurry :)
>
> > When I am building from a directory other than $(top_srcdir),
> > I get a configure error:
> >   config.status: linking ../GNUmakefile to GNUmakefile
> >   config.status: error: ../GNUmakefile: file not found
> > When I build from $(top_srcdir), the build is successful, but make check
> > returns this:
> >   ==
> >   69 tests were run
> >   9 PASS, 0 FAIL
> >   60 SKIP, 0 UNKNOWN
> >   ==
> > Only 9 tests run?! I find it strange.
> > make check, or make test in the test folder, gives me:
> >   =
> >   76 tests were run
> >   72 PASS, 3 FAIL
> >   1 SKIP, 0 UNKNOWN
> >   =
>
> 9 is the number of run-unit-tests, so it seems
> all the run-px tests are failing for you.
> Do you have perl installed?

$ perl --version
This is perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-thread-multi
Arch GNU/Linux, if it matters.

Probably I missed something before; after re-bootstrapping from bzr, I got it built in a separate directory, but again 9/69 on make check. There is no "make test" target.

Also, is there a manual for all these .px files? Obviously they are tests using some Perl framework, but I am still confused about how they relate to DejaGnu, Autotest and Check. And what is the canonical, GNU-way testing tool?

PS: I will fix the time on my PC.

--
Best regards, illusionoflife
Re: [Bug-wget] Concurrency and wget
Meanwhile, I wrote a simple proof of concept (parallel dummy downloads using threads, dummy downloading of chunks, etc.). I am at the point where I want to implement HTTP-header metalink (RFC 6249). I just can't find any servers to test with... maybe you can help me out?

Well, since there is no response to my previous post: is there any interest in getting that done anyway?

Tim

On Tuesday 03 April 2012, Tim Ruehsen wrote:
> Hi Giuseppe, hi Micah,
>
> While I couldn't sleep last night, I thought about wget and concurrency...
>
> I had the idea of using a top-down approach to outline what wget is doing,
> just to have an overview without struggling with the details of
> implementation. As a side effect one would have a (textual? graphical?)
> starting point for contributors to rush into the project. A chance to have
> a clear and well-documented design.
>
> Since maintenance of a flowchart is time-consuming and requires some extra
> skills and tools, pure text in the form of a "programming language" seems
> to fit.
>
> Here is just a beginning, let's say a basis for discussions.
> If you don't mind, I would like to take part in ongoing development.
>
> Basic wget functionality (download a given URI/IRI):
>
> main (URI) {
>   put URI into todo-list
>   while todo-list is not empty {
>     download_and_analyse(next entry)
>   }
> }
>
> download_and_analyse (URI) {
>   download URI to FILE
>   add URI to done-list
>   remove URI from todo-list
>   scan FILE and add URIs to todo-list if not already in done-list
> }
>
> Extended for simple multitasking (threaded, multi-process or even
> distributed). This is just one possible design for concurrent downloads.
> Maybe you have a more elegant idea.
>
> main (URI) {
>   create downloaders
>   put URI into todo-list
>   wait for status message from a downloader {
>     print status
>     if todo-list is empty {
>       stop downloaders
>       we are done
>     }
>   }
> }
>
> downloader {
>   wait for and allocate an entry in todo-list {
>     download_and_analyse(entry)
>   }
> }
>
> download_and_analyse (URI) {
>   download URI to FILE
>   add URI to done-list
>   remove URI from todo-list
>   scan FILE and add URIs to todo-list if not already in done-list
> }
>
> Extended to download a URI from several sources in parallel.
> main and downloader stay the same; just download_and_analyse() is extended.
>
> download_and_analyse (URI) {
>   /* download URI to FILE */
>   put chunk entries into chunk-list
>   create chunk_loaders
>   wait for status message from a chunk_loader {
>     send modified status message to main
>     if chunk-list is empty {
>       stop chunk_loaders
>       end loop
>     }
>   }
>
>   add URI to done-list
>   remove URI from todo-list
>   scan FILE and add URIs to todo-list if not already in done-list
> }
>
> chunk_loader {
>   wait for and allocate an entry in chunk-list {
>     download(entry)
>     remove entry from chunk-list
>   }
> }
>
> After some iterations we should come to a point where we can make further
> decisions:
> - how to implement concurrency (threads, processes, distributed processes,
>   (cloud))
> - how to implement communication between tasks
> - is a wget rewrite reasonable?
> - which existing code to recycle?
> - creating libraries from existing code (e.g. libwget) or using external
>   libraries (e.g. for network stuff, parsing and creating URI/IRIs, etc.)
> - create a list of test code, especially for the library code
> - ... etc. etc. ...
>
> Tim
[Bug-wget] [PATCH] use empty query for filename generation
On March 29th, Alejandro Supu reported an issue on the list (Bug on latest wget (1.3.14)). I could reproduce his problem with the trunk version.

In url.c / url_file_name() an empty query is not used for the filename generation. I wrote a patch which I forgot to put on the list. Here it comes ;-)

Tim

=== modified file 'src/ChangeLog'
--- src/ChangeLog 2012-03-25 15:49:55 +0000
+++ src/ChangeLog 2012-03-30 09:18:54 +0000
@@ -1,3 +1,7 @@
+2012-03-30  Tim Ruehsen
+
+	* url.c: use empty query in local filenames
+
 2012-03-25  Giuseppe Scrivano
 
 	* utils.c: Include .

=== modified file 'src/url.c'
--- src/url.c 2011-01-01 12:19:37 +0000
+++ src/url.c 2012-03-30 09:14:56 +0000
@@ -1502,7 +1502,7 @@
 {
   struct growable fnres;        /* stands for "file name result" */
 
-  const char *u_file, *u_query;
+  const char *u_file;
   char *fname, *unique;
   char *index_filename = "index.html"; /* The default index file is index.html */
@@ -1561,12 +1561,11 @@
   u_file = *u->file ? u->file : index_filename;
   append_uri_pathel (u_file, u_file + strlen (u_file), false, &fnres);
 
-  /* Append "?query" to the file name. */
-  u_query = u->query && *u->query ? u->query : NULL;
-  if (u_query)
+  /* Append "?query" to the file name, even if empty.  */
+  if (u->query)
     {
       append_char (FN_QUERY_SEP, &fnres);
-      append_uri_pathel (u_query, u_query + strlen (u_query),
+      append_uri_pathel (u->query, u->query + strlen (u->query),
                          true, &fnres);
     }
 }
Re: [Bug-wget] patches for TLS SNI support and --match-query-string option
Noël Köthe writes:

> Hello,
>
> there are two patches in the wget bug tracker which are requested by
> some users.
> Are there any problems with them, or can they be included in trunk?

The first one seems OK; I will look more carefully at the second patch in the next few days.

I am planning a release in the coming weeks; it would be nice to include them too.

Cheers,
Giuseppe