>>1 - I'm a bit concerned that reasonable stemming (Porter/Snowball)
>>apparently produces non-word stems .. i.e. not really human readable.

It is possible to derive the human-readable form of a stemmed term using either re-analysis of the indexed content or a TermPositionVector. Either technique should give you the position data required to discover the original form. The highlighter package is one example of where this technique is used.

Cheers
Mark
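
P.S. A rough sketch of the re-analysis idea, in case it helps. This is plain Java, not Lucene code: the class name StemSurfaceForms, the method names, and the toyStem rules are all made up for illustration, and toyStem is only a crude stand-in for a real Porter/Snowball stemmer. The point is just that running the original text through the same stemming step again lets you record which surface forms produced each stem.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: recover the human-readable forms behind each
// stem by re-analysing the original text. Nothing here is a Lucene API.
final class StemSurfaceForms {

    // Hypothetical stand-in for a Porter/Snowball stemmer. It applies a few
    // crude suffix rules so that, like the real thing, it can emit stems
    // that are not dictionary words (e.g. "pony" -> "poni").
    static String toyStem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        if (w.length() > 4 && w.endsWith("ies")) {
            return w.substring(0, w.length() - 3) + "i";
        }
        if (w.length() > 2 && w.endsWith("y")) {
            return w.substring(0, w.length() - 1) + "i";
        }
        if (w.length() > 3 && w.endsWith("s") && !w.endsWith("ss")) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    // Re-analyse the text: split it into tokens, stem each token, and
    // remember every original token that mapped to a given stem.
    // Insertion order is kept so the first form seen is listed first.
    static Map<String, Set<String>> surfaceFormsByStem(String text) {
        Map<String, Set<String>> forms = new LinkedHashMap<>();
        for (String token : text.split("[^\\p{L}]+")) {
            if (token.isEmpty()) {
                continue; // split() yields "" when the text starts with punctuation
            }
            forms.computeIfAbsent(toyStem(token), k -> new LinkedHashSet<>())
                 .add(token);
        }
        return forms;
    }
}
```

Calling StemSurfaceForms.surfaceFormsByStem("Ponies run; a pony runs.") groups "Ponies" and "pony" under the stem "poni", so a UI could display one of the original words instead of the stem. In a real index you would presumably swap toyStem for the analyzer used at index time, or read the stored position data instead of re-tokenizing.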