Moved to htdig3-dev from htdig list...

According to Aaron Turner:
> On Fri, 18 Jun 1999, Gilles Detillieux wrote:
>
> > I think your best bet would be to customize Display::buildMatchList() in
> > htsearch/Display.cc to do what you want.  This would allow you to weed out
> > the duplicates you want to exclude before they're counted and paginated.
>
> Well, I had a friend who knows C and some C++ and he's gagging on the code.
> Any pointers?  Any chance of convincing any of the developers that this is
> important? :-)

As stated, the problem seems rather unique to your circumstances, as far
as I can tell.  However, I could see a use for a more general purpose
external match filter interface to htsearch.  After building the list of
matches, htsearch would pass the list of URLs (and possibly other
information) to an external filter, customised by the site maintainer,
which would decide what to keep and what to delete, and return to htsearch
the pared down list (or a list of deletes).  Writing such a filter may
seem a less daunting task than customizing htsearch's buildMatchList()
routine.  Thoughts?  Ideas?  Takers?

> The way I would envision this is two new params to be passed to htsearch.
> First, 'uniqueid' which would be the name of another param in the cgi
> string.  example:
>
> /cgi-bin/file?id=124&c=1.2.4.6&uniqueid=id&uniqueroot=/cgi-bin/file&...
> /cgi-bin/file?id=124&c=1.3.5.6&uniqueid=id&uniqueroot=/cgi-bin/file&...
> /cgi-bin/file?id=127&c=1.3.5.6&uniqueid=id&uniqueroot=/cgi-bin/file&...
>
> htsearch would use the value of uniqueid and uniqueroot to determine
> uniqueness of a URL.  Any two hits that start with uniqueroot and have
> the same value for the parameter named by uniqueid (in this case 'id')
> are considered duplicates.  In the case above, #1 and #2 are dupes, but
> #3 is unique.
>
> The whole point of this is for dynamic sites that use DBs as their
> backend and something like PHP to access the content.

I assume you're suggesting uniqueid and uniqueroot as input parameters to
htsearch, or perhaps config file attributes, and not as parameters to the
CGI script being indexed.  Is that correct?  If so, your example above is
confusing.  If not, I'm afraid you lost me.  It doesn't make sense to me
that htsearch would actually try to use parameters embedded inside URLs
in the search results, as its own parameters.

Regardless, this seems to me to be a pretty specific solution to a specific
problem, and I don't see it being used generally.  Of course, if you can
find someone to code it for you, and it doesn't break anything, then I'm
sure no one would have a problem with it being included in the code.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
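
For illustration only, the core of an external match filter of the kind
floated above might look roughly like the sketch below.  It applies the
uniqueroot/uniqueid rule from the quoted proposal, treating both as
settings of the filter rather than as parameters embedded in the indexed
URLs.  Nothing here is existing htsearch code: the names queryValue() and
filterMatches() are made up for this example, and how htsearch would hand
the match list to a filter is left open, as it is in the message.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: look up query parameter `name` in `url`.
// Returns true and stores the value if the parameter is present.
bool queryValue(const std::string& url, const std::string& name,
                std::string& value) {
    std::string::size_type pos = url.find('?');
    if (pos == std::string::npos) return false;
    ++pos;  // step past the '?'
    while (pos < url.size()) {
        std::string::size_type end = url.find('&', pos);
        if (end == std::string::npos) end = url.size();
        const std::string pair = url.substr(pos, end - pos);
        const std::string::size_type eq = pair.find('=');
        if (eq != std::string::npos && pair.substr(0, eq) == name) {
            value = pair.substr(eq + 1);
            return true;
        }
        pos = end + 1;  // move on to the next name=value pair
    }
    return false;
}

// Hypothetical filter: keep the first hit for each distinct value of
// `idParam` among URLs that start with `root`.  URLs outside `root`, or
// without the parameter, are passed through untouched.
std::vector<std::string> filterMatches(const std::vector<std::string>& urls,
                                       const std::string& root,
                                       const std::string& idParam) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> kept;
    for (const std::string& url : urls) {
        const bool underRoot = url.compare(0, root.size(), root) == 0;
        std::string id;
        const bool hasId = underRoot && queryValue(url, idParam, id);
        if (hasId && !seen.insert(id).second) continue;  // duplicate id
        kept.push_back(url);
    }
    return kept;
}
```

Called with root "/cgi-bin/file" and idParam "id" on the three example
URLs, this keeps #1 and #3 and drops #2.  A real filter would presumably
read the URLs from htsearch and write back the pared down list (or a list
of deletes), but that exchange format is an open question.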
