The following module was proposed for inclusion in the Module List:
modid: Lingua::En::Tagger
DSLIP: adpOg
description: Part of speech tagger for English
userid: MACIEJ (Maciej Ceglowski)
chapterid: 11 (String_Lang_Text_Proc)
communities:
Part of a toolkit for latent semantic indexing (LSI) to be
introduced at the O'Reilly Bioinformatics Conference in February
similar:
None that I can find - there are some modules under Lingua and
Lingua::En that deal with a specific subset of the functionality
(finding proper names, for example), but no full POS tagger
rationale:
Being able to auto-tag English text with part-of-speech information
is very helpful in building all kinds of natural language processing
tools, especially search engines (where you want noun phrases
and little else). Several tagging algorithms exist for English
text, but none of them seems to be implemented in Perl (at least
not under an open source license, or on the CPAN).
I've been using a POS tagger for my own work on latent semantic
indexing, and this seems like a good module to abstract out so that
other CPAN users can benefit.
For now the module is something of a homebrew, but we intend to
add support for standard algorithms such as Brill's tagger in
future versions.
I think the namespace is appropriate, since the module is by its
nature specific to English, but I welcome other suggestions.
enteredby: MACIEJ (Maciej Ceglowski)
enteredon: Sun Oct 20 21:38:23 2002 GMT
The resulting entry would be:
Lingua::En::
::Tagger adpOg Part of speech tagger for English MACIEJ
Thanks for registering,
The Pause Team
PS: The following links are only valid for module list maintainers:
Registration form with editing capabilities:
https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=53300000_6d9f60bdac7db8ba&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=53300000_6d9f60bdac7db8ba&SUBMIT_pause99_add_mod_insertit=1