Hi, Marc. Thank you, thank you, and thank you for your 3 patches.
A couple more thank yous as well for the 3rd patch, because 1) it also
points out a bug fix for the 3.1.5 code, and 2) it pointed me in the
direction of a bug I introduced in Mike Grommet's date range handling
code. I ended up moving the localtime() call up above the gmtime() call,
to prevent it from potentially clobbering the data returned by gmtime()
(both return a pointer to a static structure) before it's used.
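
For anyone who hasn't been bitten by this before, here is a minimal sketch of the hazard and the usual way around it. It is not the htdig code: BrokenDownTimes and snapshot_times() are names I made up for illustration. The idea is simply to copy each result into your own struct tm before the next call can overwrite it.

```cpp
#include <ctime>

// Hypothetical helper type, not part of htdig: holds private copies of
// both broken-down times.
struct BrokenDownTimes {
    std::tm local;  // copy of what localtime() returned
    std::tm utc;    // copy of what gmtime() returned
};

// localtime() and gmtime() may hand back pointers to the same static
// struct tm, so each result is copied out before the other call is made.
BrokenDownTimes snapshot_times(std::time_t t)
{
    BrokenDownTimes out = {};
    if (const std::tm *lt = std::localtime(&t))
        out.local = *lt;    // copy now; the next call may clobber *lt
    if (const std::tm *gt = std::gmtime(&t))
        out.utc = *gt;      // copy now, for the same reason
    return out;
}
```

Holding on to the raw pointers instead of the copies is what goes wrong: after the second call, both pointers may refer to the same data.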
According to Marc Pohl:
> while testing the b3 release of htdig-3.2.0 i found several errors in
> htdig. (all patches are against the latest snapshot build)
...
> if you init a URL with an empty string "" then the result of _url is
> "http:///". this happens for example while constructing http-referer
> headers for update-digs. the patch is to ensure that the _host is
> set if we are constructing non-file urls.
>
> diff -ur htdig-3.2.0b4-061701-orig/htcommon/URL.cc htdig-3.2.0b4-061701/htcommon/URL.cc
> --- htdig-3.2.0b4-061701-orig/htcommon/URL.cc Sun May 20 09:13:46 2001
> +++ htdig-3.2.0b4-061701/htcommon/URL.cc Thu Jun 21 07:44:59 2001
> @@ -701,7 +701,12 @@
> // Also ensures the port number is correct for the service
> //
> void URL::constructURL()
> -{
> +{
> + if (strcmp((char*)_service, "file") != 0 && _host.length() == 0) {
> + _url = "";
> + return;
> + }
> +
> _url = _service;
> _url << ":";
>
I'm a little bit concerned about this patch, though. What happens if
the _host is an empty string, but _path, _user, or _port aren't empty
strings? In these cases, shouldn't constructURL() set _url to something
more than just an empty string? I've committed your patch to CVS already,
but it occurred to me when looking at the 3.1.5 code for similar problems
that this patch may introduce a problem. Can anyone more familiar with
this code comment?
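
To make the concern concrete, here is a rough sketch of the case I have in mind. It is not the real URL class: UrlParts and construct_with_patch() are hypothetical stand-ins, and the URL assembly is much simplified. Only the early return mirrors the patch.

```cpp
#include <string>

// Hypothetical stand-in for a few of the URL class's fields.
struct UrlParts {
    std::string service;
    std::string host;
    std::string path;
};

// Simplified assembly with the same early return as the patch:
// a non-file URL with an empty host yields an empty string.
std::string construct_with_patch(const UrlParts &u)
{
    if (u.service != "file" && u.host.empty())
        return "";                      // any path that was set is dropped
    return u.service + "://" + u.host + u.path;
}
```

With service "http", an empty host and path "/index.html", this returns "", so the path is silently lost. Whether that combination can actually occur in the real code is exactly what I'm unsure about.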
> i'm using alternate workfiles for my search-engine. if i also use md5
> signatures then the second initial dig will fail, because htdig uses the
> existing md5-database and no alternate workfile.
>
> diff -ur htdig-3.2.0b4-061701-orig/htdig/htdig.cc htdig-3.2.0b4-061701/htdig/htdig.cc
> --- htdig-3.2.0b4-061701-orig/htdig/htdig.cc Sun May 20 09:13:51 2001
> +++ htdig-3.2.0b4-061701/htdig/htdig.cc Thu Jun 21 07:53:44 2001
> @@ -194,6 +194,13 @@
> configValue << ".work";
> config->Add("doc_excerpt", configValue);
> }
> +
> + configValue = config->Find("md5_db");
> + if (configValue.length() != 0)
> + {
> + configValue << ".work";
> + config->Add("md5_db", configValue);
> + }
> }
>
>
Looks good! It'll be in the next snapshot.
> my last patch repairs the broken date_factor calculation. the problem is
> that the multiplication of the timestamp with 1000 produces an overflow
> on 32 bit systems. so this should be done with floating point arithmetic.
> another small performance patch is to move the call of time(0) out of the
> loop to avoid frequent calling of time().
>
> diff -ur htdig-3.2.0b4-061701-orig/htsearch/Display.cc htdig-3.2.0b4-061701/htsearch/Display.cc
> --- htdig-3.2.0b4-061701-orig/htsearch/Display.cc Sun Jun 10 09:13:56 2001
> +++ htdig-3.2.0b4-061701/htsearch/Display.cc Thu Jun 21 09:24:13 2001
> @@ -1190,12 +1190,13 @@
> (config->Value("endday")) ||
> (config->Value("endyear")));
>
> + time_t now = time((time_t *)0); // fill in all fields for mktime
> +
> // find the end of time
> endoftime = gmtime(&eternity);
>
> if(dategiven) // user specified some sort of date information
> {
> - time_t now = time((time_t *)0); // fill in all fields for mktime
> tm *lt = localtime(&now); // - Gilles's fix
> startdate = *lt;
> enddate = *lt;
> @@ -1424,7 +1425,7 @@
> if (date_factor != 0.0)
> {
> date_score = date_factor *
> - ((thisRef->DocTime() * 1000 / (double)time(0)) - 900);
> + ((thisRef->DocTime() * 1000.0 / (double)now) - 900);
> score += date_score;
> }
>
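
To illustrate the overflow Marc describes, here is a small standalone sketch. It assumes a 32-bit signed timestamp, which I model with std::int32_t; the function names are mine, not htdig's. The first function checks whether the integer product would fit, using 64-bit arithmetic so the check itself can't overflow. The second computes the same term as the patched expression, in floating point.

```cpp
#include <cstdint>
#include <limits>

// True if doc_time * 1000 would not fit in a signed 32-bit integer.
// The product is formed in 64 bits so the check itself is safe.
bool times_1000_overflows_int32(std::int32_t doc_time)
{
    const std::int64_t wide = static_cast<std::int64_t>(doc_time) * 1000;
    return wide > std::numeric_limits<std::int32_t>::max()
        || wide < std::numeric_limits<std::int32_t>::min();
}

// The date term from the patched expression: 1000.0 is a double, so
// doc_time is converted to double before the multiplication happens.
double date_term(std::int32_t doc_time, std::int32_t now)
{
    return (doc_time * 1000.0 / static_cast<double>(now)) - 900;
}
```

Any timestamp above 2147483 seconds overflows the 32-bit product, which covers every document dated after late January 1970. In floating point, a document stamped "now" gets a term of exactly 100, and the term falls toward -900 as the timestamp approaches 0.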
Here's what I ended up committing to CVS for 3.2.0b4:
Index: htsearch/Display.cc
===================================================================
RCS file: /cvsroot/htdig/htdig/htsearch/Display.cc,v
retrieving revision 1.100.2.37
retrieving revision 1.100.2.38
diff -u -p -r1.100.2.37 -r1.100.2.38
--- htsearch/Display.cc 2001/06/19 22:04:02 1.100.2.37
+++ htsearch/Display.cc 2001/06/22 20:51:50 1.100.2.38
@@ -9,7 +9,7 @@
// or the GNU Public License version 2 or later
// <http://www.gnu.org/copyleft/gpl.html>
//
-// $Id: Display.cc,v 1.100.2.37 2001/06/19 22:04:02 grdetil Exp $
+// $Id: Display.cc,v 1.100.2.38 2001/06/22 20:51:50 grdetil Exp $
//
#ifdef HAVE_CONFIG_H
@@ -1174,6 +1174,10 @@ Display::buildMatchList()
tm startdate; // structure to hold the startdate specified by the user
tm enddate; // structure to hold the enddate specified by the user
+ time_t now = time((time_t *)0); // fill in all fields for mktime
+ tm *lt = localtime(&now); // - Gilles's fix
+ startdate = *lt;
+ enddate = *lt;
time_t eternity = ~(1<<(sizeof(time_t)*8-1)); // will be the largest value holdable by a time_t
tm *endoftime; // the time_t eternity will be converted into a tm, held by this variable
@@ -1195,11 +1199,6 @@ Display::buildMatchList()
if(dategiven) // user specified some sort of date information
{
- time_t now = time((time_t *)0); // fill in all fields for mktime
- tm *lt = localtime(&now); // - Gilles's fix
- startdate = *lt;
- enddate = *lt;
-
// set up the startdate structure
// see man mktime for details on the tm structure
startdate.tm_sec = 0;
@@ -1424,7 +1423,7 @@ Display::buildMatchList()
if (date_factor != 0.0)
{
date_score = date_factor *
- ((thisRef->DocTime() * 1000 / (double)time(0)) - 900);
+ ((thisRef->DocTime() * 1000.0 / (double)now) - 900);
score += date_score;
}
For 3.1.5, you can simply append the ".0" to 1000 as above, or if you've
already applied the dateRange.1 patch, here's the additional patch to
apply...
--- htsearch/Display.cc.patched 2001/06/19 22:07:58
+++ htsearch/Display.cc 2001/06/22 20:57:22
@@ -1100,6 +1100,10 @@ Display::buildMatchList()
tm startdate; // structure to hold the startdate specified by the user
tm enddate; // structure to hold the enddate specified by the user
+ time_t now = time((time_t *)0); // fill in all fields for mktime
+ tm *lt = localtime(&now); // - Gilles's fix
+ startdate = *lt;
+ enddate = *lt;
time_t eternity = ~(1<<(sizeof(time_t)*8-1)); // will be the largest value holdable by a time_t
tm *endoftime; // the time_t eternity will be converted into a tm, held by this variable
@@ -1121,11 +1125,6 @@ Display::buildMatchList()
if(dategiven) // user specified some sort of date information
{
- time_t now = time((time_t *)0); // fill in all fields for mktime
- tm *lt = localtime(&now); // - Gilles's fix
- startdate = *lt;
- enddate = *lt;
-
// set up the startdate structure
// see man mktime for details on the tm structure
startdate.tm_sec = 0;
@@ -1332,7 +1331,7 @@ Display::buildMatchList()
if (thisRef) // We better hope it's not null!
{
score += date_factor *
- ((thisRef->DocTime() * 1000 / (double)time(0)) - 900);
+ ((thisRef->DocTime() * 1000.0 / (double)now) - 900);
int links = thisRef->DocLinks();
if (links == 0)
links = 1; // It's a hack, but it helps...
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930
_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/htdig-dev