On Sat, 26 Jun 2004, Stephen Church wrote:

> I am a non-IT end-user of HT DIG search software 3.1.6 installed by my
> Japanese IT specialist last week. There is a database of 400 financial
> interviews including frequent use of the finance technical term M&A,
> i.e. a non-standard ligature including the ampersand.
>
> From the HT DIG website I understand that the ampersand, &, presents
> search problems, and in fact Search "M&A" returns no matches. Is there
> a simple solution to this problem?
There are a couple of things you might try. First, setting
minimum_word_length to 2 should allow searches on M&A. With a default
config the '&' is stripped, so what the indexer and search interface
actually see is 'MA', which is shorter than the default minimum word
length of 3. This option will almost certainly bloat your databases to
some degree, since all two-letter terms will now be included.

Another option is to remove '&' from valid_punctuation and add it to
extra_word_characters, so that 'M&A' is kept as a single three-character
word. For example:

    valid_punctuation: -_/!#$%^'
    extra_word_characters: &

I think the first option is the better choice if you can afford a bit of
bloat in your databases. Either way, the databases will need to be
rebuilt after the config change.

See the following for further information on the attributes mentioned
above:

http://www.htdig.org/attrs.html#minimum_word_length
http://www.htdig.org/attrs.html#valid_punctuation
http://www.htdig.org/attrs.html#extra_word_characters
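To make the effect of the three attributes concrete, here is a rough Python sketch of the word handling described above. It is only a toy model and not ht://Dig's actual parser: the function name index_terms and the DEFAULT_PUNCT value are assumptions made for illustration (check attrs.html for the real default of valid_punctuation in your version).

```python
def index_terms(text, valid_punctuation, extra_word_characters, minimum_word_length):
    """Toy model of word extraction; NOT ht://Dig's real implementation."""
    terms = []
    current = []
    for ch in text + " ":                    # trailing space flushes the last word
        if ch.isalnum() or ch in extra_word_characters:
            current.append(ch.lower())       # kept as part of the word
        elif ch in valid_punctuation:
            continue                         # allowed inside a word, but stripped
        else:
            word = "".join(current)          # any other character ends the word
            if len(word) >= minimum_word_length:
                terms.append(word)           # too-short words are dropped
            current = []
    return terms


# Assumed punctuation list that includes '&' (hypothetical stand-in for the default).
DEFAULT_PUNCT = ".-_/!#$%^&'"

print(index_terms("M&A", DEFAULT_PUNCT, "", 3))    # []       'MA' is too short
print(index_terms("M&A", DEFAULT_PUNCT, "", 2))    # ['ma']   first option
print(index_terms("M&A", "-_/!#$%^'", "&", 3))     # ['m&a']  second option
```

In this model the first option indexes the term as 'ma' along with every other two-letter word, while the second keeps 'm&a' intact and leaves the minimum length alone.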

