Hello David,
On 30 Jul 08, at 16:08, David Bovill wrote:
> Is there a resource/index that anyone knows of for plain,
> uninteresting, dull words? I want to take arbitrary chunks of text
> and search for "interesting" words, that is, domain-specific words
> that might be useful as links for creating dictionary entries. This
> would mean creating a list of words and stripping "the", "it", etc.
> I am imagining it working like a spelling dictionary with the ability
> to manually edit entries, but I'd like a good starting list. Not sure
> what to search for :)
1. You might search for what are called 'stopwords' (uninteresting
words) using any Internet search engine.
2. Have a look also at what is called 'stemming', which reduces
different forms of a word to the same form:
http://www.comp.lancs.ac.uk/computing/research/stemming/general/
3. I have put an English, French, Italian, Spanish, German and
Portuguese stemmer library on RevOnline (username: sosmartsoftware)
that could help you too.
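To make point 1 concrete, here is a minimal sketch of stopword filtering. It is written in Python rather than Revolution, purely for illustration. The STOPWORDS set is a tiny made-up sample (an assumption, not a published list), and interesting_words is a hypothetical helper name; in practice you would start from a downloaded stopword list and edit it by hand.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard list).
# In practice, load a published list from a file and edit it by hand.
STOPWORDS = {"the", "it", "a", "an", "and", "of", "to", "is", "in", "that"}

def interesting_words(text, stopwords=STOPWORDS):
    """Count the lowercase words in text that are not stopwords."""
    # Split into runs of letters/apostrophes; punctuation is dropped.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)

sample = "The stemmer reduces the words to a stem, and the stem is indexed."
counts = interesting_words(sample)

# Show the remaining words, most frequent first.
for word, n in counts.most_common():
    print(word, n)
```

Because the stopwords live in an ordinary set, adding or removing entries by hand is straightforward. A stemmer (point 2) could be applied to each word before counting so that variants such as "link" and "links" are counted together.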
Best regards from Paris,
Eric Chatonet.
----------------------------------------------------------------
Plugins and tutorials for Revolution: http://www.sosmartsoftware.com/
----------------------------------------------------------------
_______________________________________________
use-revolution mailing list
use-revolution@lists.runrev.com