Also sprach Andrew Scherpbier (at 08:58 AM 7/2/98 -0700) ...
>> Say I have a website where the code:
>> 
>>         Sample<I>Code</I>
>> 
>> is all over.  That's the brandname - including the italics.  If I do an
>> htdig search for "SampleCode", I get no matches.
>> 
>> Shouldn't htdig strip out all the HTML?  Or is there a conf setting I need
>> to do this?
>
>htdig does strip out the HTML, but it has no knowledge of the semantics of
>the HTML tags for those types of markups, so it assumes it is a word break.

Hrm... how about building in a list of which HTML tags should be considered
word breaks and which shouldn't?  Or just those that shouldn't, which is
probably the shorter list.
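
To make the idea concrete, here is a rough sketch in Python (not htdig's
actual code, which I haven't looked at). The tag list NON_BREAKING_TAGS and
the names WordExtractor and extract_words are made up for illustration; a
real list would need more thought. Any tag not on the list is treated as a
word break, and tags on the list are dropped without inserting a space.

```python
from html.parser import HTMLParser

# Hypothetical list of inline tags that should NOT split a word.
# Everything else (p, br, td, div, ...) is treated as a word break.
NON_BREAKING_TAGS = {"i", "b", "em", "strong", "u", "font", "span", "tt", "a"}


class WordExtractor(HTMLParser):
    """Strips tags, inserting a space only for word-breaking tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def _maybe_break(self, tag):
        # HTMLParser lowercases tag names, so <I> arrives here as "i".
        if tag not in NON_BREAKING_TAGS:
            self.parts.append(" ")

    def handle_starttag(self, tag, attrs):
        self._maybe_break(tag)

    def handle_endtag(self, tag):
        self._maybe_break(tag)

    def handle_data(self, data):
        self.parts.append(data)


def extract_words(html):
    """Return the list of indexable words found in an HTML fragment."""
    parser = WordExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.parts).split()
```

With this, extract_words("Sample<I>Code</I>") yields the single word
"SampleCode", while a tag such as <p> or <br> still separates words.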

>Just out of curiosity, how do other search engines deal with this problem?

Don't know.  :)


.........................................................................
Colin Viebrock           Creative Director - Private World Communications
[EMAIL PROTECTED]                          http://www.privateworld.com
ICQ: 11386088

                           Give a man a fish, and you feed him for a day.
             Teach him to use the Net, and he won't bother you for weeks.

