The following module was proposed for inclusion in the Module List:

  modid:       Text::TEng
  DSLIP:       cdpOp
  description: Framework for text mining scientific papers
  userid:      ANDREFS (André Fernandes dos Santos)
  chapterid:   11 (String_Lang_Text_Proc)
  communities: Soon on GitHub (github.com/andrefs)
  similar:     Text::Mining

  rationale:

    In the context of my master's thesis in Bioinformatics, I'm
    currently developing a framework for extracting numeric parameters
    (i.e. tuples formed by numeric expression + parameter name + unit
    of measure) and other relevant information from scientific
    literature. Although it has been tested mainly on biomedical
    domains, it should be usable in other areas such as physics or
    economics, because the recognition of numeric expressions is
    domain-independent, and the recognition of domain-specific
    entities is based on a domain-specific, user-provided ontology.

    From the start the project was dubbed T-Eng (from Text
    Engineering), and initially I had planned to call the distribution
    Bio::TEng. However, given its broader scope, I now think that
    Text::TEng is a better option. The only problem I see is the
    existence of Teng, a "very simple DBI wrapper/ORMapper".

    Is Text::TEng acceptable? I could also name it Text::Eng; would
    that be a better option?

    Thanks,
    André

  enteredby:   ANDREFS (André Fernandes dos Santos)
  enteredon:   Thu May 10 14:43:34 2012 GMT

The resulting entry would be:

Text::
::TEng            cdpOp Framework for text mining scientific papers ANDREFS

Thanks for registering,
--
The PAUSE

PS: The following links are only valid for module list maintainers:

Registration form with editing capabilities:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=18a00000_06db993f5b869e2b&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=18a00000_06db993f5b869e2b&SUBMIT_pause99_add_mod_insertit=1
Peek at the current permissions:
  https://pause.perl.org/pause/authenquery?pause99_peek_perms_by=me&pause99_peek_perms_query=Text%3A%3ATEng