According to Mike Grommet:
>
>
> Sorry for the long post. The buildMatchList routine is quite long.
> At least I didn't send the entire Display.cc file :)
>
> This is the buildMatchList routine, with some modifications by me to allow
> for date range searches.
> I've got the basic gist of what I need to do, but I'm stuck and not quite
> sure how to proceed.
> I think my main problem is that I am not as familiar with C++ as I need to
> be.
>
> The source below gives this error message when I compile
>
> Display.cc: In method `class List * Display::buildMatchList()':
> Display.cc:1097: no member function `ResultMatch::DocTime()' defined
> Display.cc:1097: no member function `ResultMatch::DocTime()' defined
> Display.cc:1098: parse error before `->'
> Display.cc:1108: confused by earlier errors, bailing out
> *** Error code 1
The problem is that DocTime() is a method for the DocumentRef class,
not the ResultMatch class, so you can't use thisMatch->DocTime().
You must use thisRef->DocTime().
> Below I have marked the spot in the code where it is dying...
>
> just search for
> // THIS CODE DOES NOT WORK AT ALL!!!!!
>
> I know why it doesn't work: apparently this code needs to take place
> somewhere in the
> if (date_factor != 0.0 || backlink_factor != 0.0 || typ != SortByScore)
> {}
>
> section, where the DocTime() method is actually usable through the
> DocumentRef *thisRef, but I'm not really
> certain what the if section above is actually doing, and I'm not sure how
> the date range would fit in, or even if it does.
The purpose of this if statement is to avoid fetching the DocumentRef
if we don't need it, because right now it's a fairly expensive operation.
Your code would need to go in there, and a test on the date ranges
should be included in the if.
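
As a rough illustration of that guard (only a sketch: the SearchOptions
struct, its has_date_range flag and the needsDocumentRef() name are made up
for this example and are not part of the ht://Dig source), the idea is to
decide up front whether anything needs the DocumentRef, and to fetch it
only in that case:

```cpp
#include <ctime>

// Hypothetical stand-in for the values buildMatchList() already has on hand.
struct SearchOptions {
    double date_factor;      // weight of document age in the score
    double backlink_factor;  // weight of backlinks in the score
    bool   sort_by_score;    // true when typ == SortByScore
    bool   has_date_range;   // assumed flag: the form supplied a date limit
};

// Returns true only when some later step actually needs the (expensive)
// DocumentRef lookup; otherwise the lookup can be skipped entirely.
inline bool needsDocumentRef(const SearchOptions &o)
{
    return o.date_factor != 0.0
        || o.backlink_factor != 0.0
        || !o.sort_by_score
        || o.has_date_range;
}
```

With a plain search (no date or backlink weighting, sorting by score, no
date limits) the function returns false and the database lookup is avoided;
adding a date limit is enough to turn it on.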
> Could someone
> 1) provide some help and clues here :)
> 2) look over my logic and see what you think.
>
> I know it's sort of messy. I haven't even begun to cross the issue of time
> zones...
> I just wanted to get something functioning on a date range and then go back
> and take care
> of the small details.
>
>
> Ok, so lemmie have it, I'm wearing my asbestos undies today :)
>
>
> //*****************************************************************************
> List *
> Display::buildMatchList()
> {
> char *id;
> String coded_url, url;
> ResultMatch *thisMatch;
> List *matches = new List();
> double backlink_factor = config.Double("backlink_factor");
> double date_factor = config.Double("date_factor");
> SortType typ = sortType();
>
>
> // Additions made here by Mike Grommet 4-1-99
>
> tm startdate;
> tm enddate;
>
> time_t timet_startdate;
> time_t timet_enddate;
> int monthdays[] = {31,28,31,30,31,30,31,31,30,31,30,31};
>
> // tm structure looks like this
> //
> // int tm_sec; seconds (0 - 60)
> // int tm_min; minutes (0 - 59)
> // int tm_hour; hours (0 - 23)
> // int tm_mday; day of month (1 - 31)
> // int tm_mon; month of year (0 - 11)
> // int tm_year; year - 1900
> // int tm_wday; day of week (Sunday = 0)
> // int tm_yday; day of year (0 - 365)
> // int tm_isdst; is summer time in effect?
> // char *tm_zone; abbreviation of timezone name
>
>
> // set up the startdate structure
>
> startdate.tm_sec = 0;
> startdate.tm_min = 0;
> startdate.tm_hour = 0;
>
>
> // The concept here is that if a user did not specify a part of a date,
> // then we will make assumptions...
> // For instance, suppose the user specified Feb, 1999 as the start range:
> // we take steps to make sure that the search range date starts at
> // Feb 1, 1999.
> // Along these same lines (these are in MM-DD-YYYY format):
> // Startdates:   Date       Becomes
> //               01-01      01-01-1970
> //               01-1970    01-01-1970
> //               04-1970    04-01-1970
> //               1970       01-01-1970
> // These things seem to work fine for start dates, as all months have
> // the same first day;
> // however, the ending date can't work this way.
>
>
>
> if(config.Value("startmonth") > 0)    // form input specified a start month
> {
> startdate.tm_mon = config.Value("startmonth") - 1;    // tm months are zero based. They are passed in as 1 based
> }
> else startdate.tm_mon = 0;    // otherwise, no start month, default to 0
>
>
>
> if(config.Value("startday") > 0)    // form input specified a start day
> {
> startdate.tm_mday = config.Value("startday");    // tm days are 1 based, they are passed in as 1 based
> }
> else startdate.tm_mday = 1;    // otherwise, no start day, default to 1
>
>
> // year is handled a little differently... the tm_year structure
> // wants the tm_year in a format of year - 1900.
> // since we are going to convert these dates to a time_t,
> // a time_t value of zero, the earliest possible date
> // occurs jan 1, 1970. If we allow dates < 1970, then we
> // could get negative time_t values right???
>
> if(config.Value("startyear") > 1970)    // form input specified a start year
> {
> startdate.tm_year = config.Value("startyear") - 1900;
> }
> else startdate.tm_year = 1970 - 1900;    // otherwise, no start year, specify start at 1970
>
>
> // if none of startmonth, startday, startyear are passed in from the form
> // then we want to create the date 01-01-1970
> // then we want to create the date 01-01-1970
>
>
>
> // set up the enddate structure
>
> // because of some special cases, we do things in a slightly different
> // order here
> // See below, on setting the enddate's ending day
>
> enddate.tm_sec = 0;
> enddate.tm_min = 0;
> enddate.tm_hour = 0;
>
> if(config.Value("endmonth") > 0)    // form input specified an end month
> {
> enddate.tm_mon = config.Value("endmonth") - 1;    // tm months are zero based. They are passed in as 1 based
> }
> else enddate.tm_mon = 11;    // otherwise, no end month, default to 11
>
>
> if(config.Value("endyear") >= 1970)    // form input specified an end year
> {
> enddate.tm_year = config.Value("endyear") - 1900;
> }
> else enddate.tm_year = 2038 - 1900;    // otherwise, no end year, specify end at 2038
>
>
> // Months have different numbers of days, and this makes things more
> // complicated than the startdate range.
> // Following the example above, here is what we want to happen:
> // Enddates:   Date       Becomes
> //             04-31      04-31-2038, but this must be converted to 01-19-2038
> //             05-1999    05-31-1999, May has 31 days... we want to search until the end of May so...
> //             1999       12-31-1999, search until the end of the year
>
> if(config.Value("endday") > 0)    // form input specified an end day
> {
> enddate.tm_mday = config.Value("endday");    // tm days are 1 based, they are passed in as 1 based
> }
> else enddate.tm_mday = monthdays[enddate.tm_mon];    // otherwise, no end day, default to the end of the month
>
> // now a possibility that could also happen is that someone specifies
> // a date like February 29, 1999, which could be invalid.
> // I am told that converting to a time_t will take this into account
> // and translate a February 29, 1999 date to March 1, 1999 if needed.
>
>
> // We need to check for the possibility of creating a negative time_t.
> // This could happen if someone specified a date like Jan 20, 2038... so:
>
> // does the end date fall out of range for a time_t?
> // if so set the appropriate limits
>
> if((enddate.tm_year >=(2038-1900)) && (enddate.tm_mon >= 0))
> {
>
> enddate.tm_mon = 0;
> enddate.tm_year = 2038-1900;
>
> if (enddate.tm_mday > 19) enddate.tm_mday = 19;
> }
I'd want to change the boundary conditions for enddate so it's independent
of current time_t limitations on 32-bit systems, but for now, let's get
the main part working...
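
In the meantime, two things in the date setup are worth a look. The code
above never sets tm_isdst before calling mktime(), which reads that field,
and the monthdays[] table knows nothing about leap years. Also, with
tm_hour, tm_min and tm_sec all zero, the end of the range is midnight at
the start of the end day, so documents modified later that day would fall
outside it. Here is a minimal sketch of one way to handle these, assuming
nothing beyond the standard C library (daysInMonth(), toTimeT() and
endOfMonth() are names invented for this example, not existing ht://Dig
functions):

```cpp
#include <cassert>
#include <ctime>

// Days in a month; month0 is zero-based like tm_mon, year is the full year.
inline int daysInMonth(int year, int month0)
{
    assert(month0 >= 0 && month0 < 12);
    static const int days[12] = {31,28,31,30,31,30,31,31,30,31,30,31};
    const bool leap = (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
    return (month0 == 1 && leap) ? 29 : days[month0];
}

// Convert a local calendar date and time to a time_t.
// mktime() normalizes out-of-range fields, so Feb 29, 1999 becomes Mar 1, 1999.
inline time_t toTimeT(int year, int month0, int mday, int hour, int min, int sec)
{
    struct tm t = {};          // zero every field before filling in the known ones
    t.tm_year  = year - 1900;  // tm_year counts from 1900
    t.tm_mon   = month0;
    t.tm_mday  = mday;
    t.tm_hour  = hour;
    t.tm_min   = min;
    t.tm_sec   = sec;
    t.tm_isdst = -1;           // let mktime() work out daylight saving time
    return mktime(&t);
}

// Last second of the given month, for use as an inclusive end of range.
inline time_t endOfMonth(int year, int month0)
{
    return toTimeT(year, month0, daysInMonth(year, month0), 23, 59, 59);
}
```

For example, endOfMonth(1999, 4) gives 23:59:59 on May 31, 1999 in local
time, so a search "until May 1999" includes documents dated on the 31st.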
>
> // convert the tm values into time_t values;
>
> timet_startdate=mktime(&startdate);
> timet_enddate=mktime(&enddate);
>
> // what if the user did something really goofy like choose an end date
> // that's before the start date
>
> if(timet_enddate < timet_startdate)
> {
> time_t timet_temp = timet_enddate;
>
> timet_enddate = timet_startdate;
> timet_startdate = timet_temp;
> }
>
>
>
>
> results->Start_Get();
> while ((id = results->Get_Next()))
> {
> //
> // Convert the ID to a URL
> //
> if (docIndex->Get(id, coded_url) == NOTOK)
> {
> continue;
> }
>
> // No special precautions re: the option
> // "uncoded_db_compatible" need to be taken.
> url = HtURLCodec::instance()->decode(coded_url);
> if (!includeURL(url.get()))
> {
> continue;
> }
>
>
> thisMatch = new ResultMatch();
> thisMatch->setURL(url);
> thisMatch->setRef(NULL);
>
> //
> // Get the actual document record into the current ResultMatch
> //
> // thisMatch->setRef(docDB[thisMatch->getURL()]);
>
> //
> // Assign the incomplete score to this match. This score was
> // computed from the word database only, no excerpt context was
> // known at that time, or info about the document itself,
> // so this still needs to be done.
> //
> DocMatch *dm = results->find(id);
> double score = dm->score;
>
> // We need to scale based on date relevance and backlinks
> // Other changes to the score can happen now
> // Or be calculated by the result match in getScore()
>
> // This formula derived through experimentation
> // We want older docs to have smaller values and the
> // ultimate values to be a reasonable size (max about 100)
>
>
> if (date_factor != 0.0 || backlink_factor != 0.0 || typ != SortByScore)
OK, we need to do date range tests in here, so let's see if we need to
check the date:
if (date_factor != 0.0 || backlink_factor != 0.0 || typ != SortByScore
|| timet_startdate > 0 || enddate.tm_year < 2038-1900)
> {
> DocumentRef *thisRef = docDB[thisMatch->getURL()];
> if (thisRef) // We better hope it's not null!
> {
Now, here's where we test the date, and if we reject the match, we must
clean up:
if (thisRef->DocTime() < timet_startdate ||
thisRef->DocTime() > timet_enddate)
{
delete thisMatch;
delete thisRef;
continue;
}
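
If you'd rather not depend on the 1970 and 2038 sentinel values for "no
limit given", the comparison itself could be wrapped in a small helper that
tracks whether each end of the range was supplied. This is only a sketch;
DateRange and inDateRange() are hypothetical names, not something that
exists in ht://Dig:

```cpp
#include <ctime>

// Hypothetical description of a date range in which either end may be open.
struct DateRange {
    bool   has_start;  // true if the form supplied a start date
    time_t start;      // inclusive lower bound, valid only if has_start
    bool   has_end;    // true if the form supplied an end date
    time_t end;        // inclusive upper bound, valid only if has_end
};

// Returns true when t lies inside the range; an open end never rejects.
inline bool inDateRange(const DateRange &r, time_t t)
{
    if (r.has_start && t < r.start)
        return false;
    if (r.has_end && t > r.end)
        return false;
    return true;
}
```

The rejection test in the loop would then be something along the lines of
if (!inDateRange(range, thisRef->DocTime())), followed by the same cleanup
and continue as above.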
> score += date_factor *
> ((thisRef->DocTime() * 1000 / (double)time(0)) - 900);
> int links = thisRef->DocLinks();
> if (links == 0)
> links = 1; // It's a hack, but it helps...
> score += backlink_factor
> * (thisRef->DocBackLinks() / (double)links);
> if (score <= 0.0)
> score = 0.0;
> if (typ != SortByScore)
> {
> DocumentRef *sortRef = new DocumentRef();
> sortRef->DocTime(thisRef->DocTime());
> if (typ == SortByTitle)
> sortRef->DocTitle(thisRef->DocTitle());
> thisMatch->setRef(sortRef);
> }
> }
> // Get rid of it to free the memory!
> delete thisRef;
> }
>
> thisMatch->setIncompleteScore(score);
> thisMatch->setAnchor(dm->anchor);
>
> //
> // Append this match to our list of matches.
> //
>
At this point, we've already rejected the out of range matches, so we can
unconditionally add the match to the list as before.
>
> // THIS CODE DOES NOT WORK AT ALL!!!!!
> // does the match fit within our date range?
>
> if((timet_startdate <= thisMatch->DocTime() &&
> (thisMatch->DocTime() <= timet_enddate))
> matches->Add(thisMatch);
> }
>
> //
> // The matches need to be ordered by relevance level.
> // Sort it.
> //
> sort(matches);
>
> return matches;
> }
>
> //*****************************************************************************
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930