[EMAIL PROTECTED] wrote:
I guess there are a few points:

- It is impossible to stem with total accuracy using rules alone.

- Combining a rule-based stemmer with a dictionary could also be
error-prone. Unrelated words can have the same stem: consider the past
tense of "see" and the stem of "sawing" (cutting wood).

I guess you could solve some of these by stemming to the WordNet number, for which the meaning is unique. (Strictly, "lemmatising" is probably the more accurate term, since you'll almost certainly need part-of-speech information to do it with any accuracy.)
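
As a minimal sketch of that idea, assuming a tiny hand-built lexicon rather than a real WordNet lookup: the table below is keyed on the surface form plus a part-of-speech tag, and maps to a lemma and a sense id. The sense ids, the Penn-style tags and the lemmatise() helper are illustrative assumptions, not real WordNet numbers or any library's API.

```python
# Toy lexicon: (surface form, POS tag) -> (lemma, sense id).
# The sense ids are made-up placeholders standing in for WordNet numbers.
LEXICON = {
    ("saw", "VBD"): ("see", "see.perceive"),     # past tense of "see"
    ("saw", "VBP"): ("saw", "saw.cut"),          # present tense of "to saw"
    ("sawing", "VBG"): ("saw", "saw.cut"),       # gerund of "to saw"
}


def lemmatise(word, pos):
    """Return (lemma, sense id); fall back to the lowercased word with no sense."""
    key = (word.lower(), pos)
    return LEXICON.get(key, (word.lower(), None))


print(lemmatise("saw", "VBD"))     # ('see', 'see.perceive')
print(lemmatise("sawing", "VBG"))  # ('saw', 'saw.cut')
```

With the tag supplied, "saw" (past tense of "see") and "sawing" no longer collapse to the same entry, which is the point of going through part of speech instead of suffix rules alone.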

But there are still ambiguous sentences where you can't tell which word it is. To use the example just given, "I saw wood" is pretty ambiguous without more context.
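
To illustrate the remaining ambiguity, here is a rough sketch that lists every analysis a context-free lookup would allow for each token. The CANDIDATES table, its tags and both helper functions are hypothetical and cover only this one sentence.

```python
# Hypothetical candidate table: surface form -> possible (POS tag, lemma) analyses.
CANDIDATES = {
    "i": [("PRP", "i")],
    "saw": [("VBD", "see"), ("VBP", "saw")],  # "perceived" vs. "cut with a saw"
    "wood": [("NN", "wood")],
}


def analyses(sentence):
    """Pair each token with every analysis the table allows for it."""
    result = []
    for token in sentence.split():
        key = token.lower()
        result.append((token, CANDIDATES.get(key, [("UNK", key)])))
    return result


def ambiguous_tokens(sentence):
    """Return the tokens that have more than one distinct candidate lemma."""
    return [
        token
        for token, cands in analyses(sentence)
        if len({lemma for _, lemma in cands}) > 1
    ]


print(ambiguous_tokens("I saw wood"))  # ['saw']
```

Nothing in the sentence itself picks between the two readings of "saw", so a lemmatiser either has to guess or carry both candidates forward until more context is available.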

Daniel


--
Daniel Noll

Nuix Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia    Ph: +61 2 9280 0699
Web: http://nuix.com/                               Fax: +61 2 9212 6902