[ 
https://issues.apache.org/jira/browse/CTAKES-96?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pei Chen updated CTAKES-96:
---------------------------

    Fix Version/s:     (was: future enhancement)
                   3.1-incubating
    
> Update Dependency Parser and Semantic Role Labeler - Thanks Jinho Choi and 
> Lee Beecker
> --------------------------------------------------------------------------------------
>
>                 Key: CTAKES-96
>                 URL: https://issues.apache.org/jira/browse/CTAKES-96
>             Project: cTAKES
>          Issue Type: New Feature
>            Reporter: Pei Chen
>             Fix For: 3.1-incubating
>
>
> Update/create new wrappers for ClearNLP that have been trained on clinical 
> notes (SHARP/MiPACQ).
> Some notes:
> the integration will be mostly switching to cTAKES types.
> Here are a few critical spots:
> In the tokenizer 
> (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/Tokenizer.java),
>  lines 96 and 106 are all that should need changing to switch to cTAKES 
> Sentence and Token types.
> In the pos-tagger 
> (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/PosTagger.java)
>  most of the changes should be on lines 109 and 116-118.
> In the MP Analyzer 
> (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/MPAnalyzer.java)
>  the changes would be lines 122-124 to again use the cTAKES token types.
> The Dependency Parser 
> (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/DependencyParser.java)
>  is a bit harder, but similar.  I think you can step through and find 
> instances of ClearTK types and swap them for the Dependency Relation types in 
> cTAKES.  Basically the code grabs the token, POS, and lemma data from the CAS 
> and passes it on to Jinho's SRL.  Then the work is in mapping that output back 
> into CAS-appropriate types.
> The Semantic Role Labeler 
> (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/SemanticRoleLabeler.java)
>  follows a similar flow, but also pulls out Dependency Parse information 
> from the CAS.  Then the work is in extracting the SRL arguments and 
> predicates to put back into ClearTK CAS types.
> Lastly, to get an idea of how these components are called in a UIMA pipeline, 
> I would refer to the test cases, especially the ClearNLP test case 
> (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/test/java/org/cleartk/clearnlp/ClearNLPTest.java)
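
The flow described for the Dependency Parser (read token, POS, and lemma data, hand it to the parser, map the output back into annotation types) could look roughly like the sketch below. It is only an illustration under stated assumptions: it does not use the UIMA, ClearTK, ClearNLP, or cTAKES APIs, and every name in it (WrapperFlowSketch, Token, Arc, DepNode, Parser, TOY_PARSER) is a hypothetical stand-in rather than a type from any of those projects.

```java
import java.util.ArrayList;
import java.util.List;

public class WrapperFlowSketch {

    /** Hypothetical stand-in for a token annotation read from the CAS. */
    static final class Token {
        final String form;
        final String pos;
        final String lemma;

        Token(String form, String pos, String lemma) {
            this.form = form;
            this.pos = pos;
            this.lemma = lemma;
        }
    }

    /** Hypothetical parser output for one token: head index (0 = root) and label. */
    static final class Arc {
        final int headId;
        final String label;

        Arc(int headId, String label) {
            this.headId = headId;
            this.label = label;
        }
    }

    /** Hypothetical stand-in for a dependency annotation written back to the CAS. */
    static final class DepNode {
        final int id;
        final String form;
        final int headId;
        final String label;

        DepNode(int id, String form, int headId, String label) {
            this.id = id;
            this.form = form;
            this.headId = headId;
            this.label = label;
        }
    }

    /** Hypothetical parser contract: returns one Arc per input token, in order. */
    interface Parser {
        List<Arc> parse(List<Token> sentence);
    }

    /** Passes one sentence to the parser and maps the arcs back to output nodes. */
    static List<DepNode> annotate(List<Token> sentence, Parser parser) {
        List<Arc> arcs = parser.parse(sentence);
        if (arcs.size() != sentence.size()) {
            throw new IllegalStateException("parser must return one arc per token");
        }
        List<DepNode> nodes = new ArrayList<>();
        for (int i = 0; i < sentence.size(); i++) {
            Arc arc = arcs.get(i);
            // Token ids are 1-based so that head id 0 can denote the root.
            nodes.add(new DepNode(i + 1, sentence.get(i).form, arc.headId, arc.label));
        }
        return nodes;
    }

    /** Toy parser for illustration only: first token is the root, the rest attach to it. */
    static final Parser TOY_PARSER = sentence -> {
        List<Arc> arcs = new ArrayList<>();
        for (int i = 0; i < sentence.size(); i++) {
            arcs.add(i == 0 ? new Arc(0, "root") : new Arc(1, "dep"));
        }
        return arcs;
    };

    public static void main(String[] args) {
        List<Token> sentence = List.of(
                new Token("Patient", "NN", "patient"),
                new Token("denies", "VBZ", "deny"),
                new Token("pain", "NN", "pain"));
        for (DepNode n : annotate(sentence, TOY_PARSER)) {
            System.out.println(n.id + "\t" + n.form + "\t" + n.headId + "\t" + n.label);
        }
    }
}
```

In a real wrapper, the Token and DepNode stand-ins would presumably be replaced by the corresponding cTAKES types, which is the type swap the notes above describe; only the read and write-back steps should need to change.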

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
