Greetings Gilles + all,
Yes, I agree that we need a more "polished" patch for the
distribution. I still like my intermediate path: If *any* server
blocks or URL blocks are used, then the user takes the performance
hit and re-parses each time. If *no* server/URL blocks are used, we
use Chris's patch. This should be just as fast as Chris's patch (in
the "3.1-compatible mode" without server/URL blocks), and just as
flexible as the current status (if blocks are used). If that can get
ht://Dig fast enough to get into sarge, then I suggest we implement
it first, and then work on Gilles's more complete solution at our
leisure.
A first hack at this (not even compile-tested) is attached as a diff
against Chris's patched version, so you can see what I mean. If
people are in favour, I'll try to work on it over the weekend.
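To make the intermediate path concrete, here is a minimal stand-alone
sketch of the control flow, separate from the attached patch. Everything
in it is a simplified stand-in rather than real ht://Dig code: the
globals only mirror the names used in the patch, split_patterns() and
is_excluded() are hypothetical helpers, parse_count exists purely for
illustration, and plain substring matching stands in for the real
pattern matching.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for the flags used in the patch.
bool config_server_URL_blocks = false;  // set once the parser sees any block
bool exclude_parsed = false;            // true after the first parse
std::vector<std::string> exclude_patterns;
int parse_count = 0;                    // illustration only: counts re-parses

// Split a whitespace-separated attribute value into individual patterns.
std::vector<std::string> split_patterns(const std::string &value) {
    std::istringstream in(value);
    std::vector<std::string> out;
    std::string tok;
    while (in >> tok) out.push_back(tok);
    return out;
}

// exclude_value plays the role of config->Find(&aUrl, "exclude_urls").
// Substring matching is an assumption made to keep the sketch short.
bool is_excluded(const std::string &url, const std::string &exclude_value) {
    // Re-parse every time if blocks are in use; otherwise parse only once.
    if (config_server_URL_blocks || !exclude_parsed) {
        exclude_patterns = split_patterns(exclude_value);
        exclude_parsed = true;
        ++parse_count;
    }
    for (const auto &p : exclude_patterns)
        if (url.find(p) != std::string::npos) return true;
    return false;
}
```

With config_server_URL_blocks left false, a second call with a different
exclude_value keeps using the patterns from the first parse, which is
exactly the limitation Gilles points out. Once the flag is true, every
call re-parses and per-URL values take effect again.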
One issue with caching input strings is that we would need some sort
of cache flushing; otherwise the storage just keeps growing as HtRegEx
is called repeatedly with new strings.
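As a rough illustration of the cache-flushing option, a cache keyed by
the input string could simply be emptied whenever it reaches a fixed
size. This is only a sketch under my own assumptions: PatternCache and
its methods are hypothetical names, the "flush everything" policy is
just the simplest one available, and a list of split tokens stands in
for whatever compiled form would really be stored.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical bounded cache: maps an attribute string to its parsed form.
class PatternCache {
public:
    explicit PatternCache(std::size_t max_entries)
        : max_entries_(max_entries), parses_(0) {}

    // Returns the parsed patterns for value, parsing only on a cache miss.
    // The reference stays valid until the next flush.
    const std::vector<std::string> &get(const std::string &value) {
        auto it = cache_.find(value);
        if (it != cache_.end()) return it->second;

        // Crude flushing policy: drop everything once the limit is reached.
        if (cache_.size() >= max_entries_) cache_.clear();

        ++parses_;
        std::istringstream in(value);
        std::vector<std::string> parsed;
        std::string tok;
        while (in >> tok) parsed.push_back(tok);
        return cache_.emplace(value, std::move(parsed)).first->second;
    }

    std::size_t size() const { return cache_.size(); }
    std::size_t parses() const { return parses_; }

private:
    std::size_t max_entries_;
    std::size_t parses_;  // how many times a string was actually parsed
    std::unordered_map<std::string, std::vector<std::string>> cache_;
};
```

The limit bounds memory at the cost of occasional re-parsing after a
flush. A smarter eviction policy (least recently used, say) would be
possible, but is probably not worth the complexity for a first cut.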
Cheers,
Lachlan
On Wed, 21 Apr 2004 07:45 am, Gilles Detillieux wrote:
> Hi, Chris and other developers. The problem with this fix is that
> exclude_urls and bad_querystr can no longer be used in server
> blocks or URL blocks, as they'll only be parsed once regardless of
> how they're used.
--
[EMAIL PROTECTED]
ht://Dig developer DownUnder (http://www.htdig.org)
--- htcommon/conf_parser.h 2003-09-26 22:22:57.000000000 +1000
+++ htcommon/conf_parser.h 2004-04-21 22:56:50.000000000 +1000
@@ -71,3 +71,4 @@
+extern bool config_server_URL_blocks;
--- htcommon/conf_parser.cxx 2003-11-22 15:15:40.000000000 +1100
+++ htcommon/conf_parser.cxx 2004-04-21 22:56:32.000000000 +1000
@@ -99,6 +99,8 @@
#include "htconfig.h"
#endif /* HAVE_CONFIG_H */
+bool config_server_URL_blocks = false;
+
/* Bison version > 1.25 needed */
/* TODO:
1. Better error handling
@@ -1131,6 +1133,7 @@
case 11:
{
+ config_server_URL_blocks=true;
// check if "<param> ... </param>" are equal
if (strcmp(yyvsp[-10].str,yyvsp[-2].str)!=0) {
// todo: setup error string, return with error.
--- htdig/Retriever.cc 2004-04-21 22:58:07.000000000 +1000
+++ htdig/Retriever.cc 2004-04-21 22:58:39.000000000 +1000
@@ -996,7 +996,7 @@
// mark it as invalid
//
- if(!(exclude_parsed)){
+ if(config_server_URL_blocks || !(exclude_parsed)){
//only parse this once and store into global variable
tmpList.Destroy();
tmpList.Create(config->Find(&aUrl, "exclude_urls"), " \t");
@@ -1016,7 +1016,7 @@
// mark it as invalid
//
- if(!(badquerystr_parsed)){
+ if(config_server_URL_blocks || !(badquerystr_parsed)){
//only parse this once and store into global variable
tmpList.Destroy();
tmpList.Create(config->Find(&aUrl, "bad_querystr"), " \t");