On Sat, 26 Jun 2004, Stephen Church wrote:

> I am a non-IT end-user of HT DIG search software 3.1.6 installed by my
> Japanese IT specialist last week. There is a database of 400 financial
> interviews including frequent use of the finance technical term M&A, i.e.
> a non-standard ligature including the ampersand.
> From the HT DIG website I understand that the ampersand, &, presents
> search problems and in fact Search "M&A" returns no matches. Is there a
> simple solution to this problem?

There are a couple of things that you might try. First, setting
minimum_word_length to 2 should allow for searches on M&A. With a default
config the '&' is stripped, so what the indexer and search interface
actually see is 'MA', which is shorter than the default minimum word
length of 3 and is therefore ignored. This option will almost certainly
bloat your databases to some degree, since all two-letter terms will now
be indexed. Another option is to remove '&' from valid_punctuation and add
it to extra_word_characters. For example,

valid_punctuation:-_/!#$%^'
extra_word_characters: &

I think the first option is the better choice if you can afford a bit of
bloat in your databases.
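
For the first option, the change should amount to a single line in your
ht://Dig config file. This is only a sketch: it assumes the attribute is
not already set somewhere else in that file, and the file's name and
location depend on how your installation was set up.

```
minimum_word_length: 2
```

Either change only takes effect for newly indexed words, so the databases
would need to be rebuilt afterwards (for example by re-running rundig)
before a search for M&A can match.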

See the following for further information on the attributes mentioned
above.

http://www.htdig.org/attrs.html#minimum_word_length
http://www.htdig.org/attrs.html#valid_punctuation
http://www.htdig.org/attrs.html#extra_word_characters

