Re: New to cTakes and need help

Sajit Kumar Sun, 03 Feb 2019 20:48:20 -0800

Hi Gandhi,

Thanks for your reply.


Let me elaborate my task. I need to extract the CUIs for the medical
concepts in clinical notes. I believe cTakes can be a very good tool for
this.
I went through the default clinical pipeline. The pipeline has various
tasks in the pipe, such as boundary detection, tokenization, entity
recognition, etc. I believe the output XMI document also has sections for
each of these tasks. However, as I am only interested in the CUIs for
medical concepts, I would only want entity recognition and entity
properties in my output. Is it possible to create a custom pipeline
based on this? Or is it possible to turn off the output of unwanted
sections? I hope you understand what I am trying to say. Please advise.
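
If the output cannot be trimmed inside cTAKES itself, one fallback would be
to filter the XMI after the fact. Below is a minimal Python sketch of that
idea; it is not a cTAKES API. It assumes, without confirmation from the type
system documentation, that each concept is written as an element whose local
name is UmlsConcept and which carries a cui attribute. The helper names
local_name and extract_cuis are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_cuis(xmi_text):
    """Return the distinct CUIs in an XMI document, in first-seen order.

    Assumption: concepts are serialized as elements whose local name is
    'UmlsConcept' and which carry a 'cui' attribute.
    """
    root = ET.fromstring(xmi_text)
    cuis = []
    for element in root.iter():
        if local_name(element.tag) != "UmlsConcept":
            continue  # skip sentences, tokens, chunks and everything else
        cui = element.get("cui")
        if cui and cui not in cuis:
            cuis.append(cui)
    return cuis
```

Matching on the local name rather than a full namespace URI keeps the sketch
independent of the exact namespace used in the file. To use it, read one XMI
file into a string and pass that string to extract_cuis.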

Also, is there any documentation on the structure of cTAKES output files,
such as XMI files?

Looking forward to your response.

Regards,
Sajit

On Sun, 3 Feb, 2019, 20:48 gandhi rajan <[email protected]> wrote:

> Hi Sajit, I would say the default clinical pipeline is the best place to
> start for a beginner -
> https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline
>
> Also, you need to elaborate on what information you are looking for when
> you say much of the information is irrelevant for you.
>
> On Sunday, February 3, 2019, Sajit Kumar <[email protected]> wrote:
>
>> Hi All,
>>
>> I am new to cTakes. I have heard great things about cTakes in processing
>> clinical notes. I have been able to successfully install and launch cTakes
>> applications. However, I have not been able to find enough documentation
>> for the XMI output from these applications such as CPE etc. If anyone can
>> guide me to some documentation to understand the structure of these outputs
>> that would be helpful.
>>
>> Additionally, I am working on a task where I am interested in extracting
>> the UMLS and SNOMED medical concepts from the clinical notes. However, I
>> see that the output usually has a lot of information that is not relevant
>> to my task. I tried my hand at creating a custom pipeline to get rid of
>> this information, but it threw an exception. Please find the script below.
>>
>> //       ***  Piper File  ***
>> //       Created by Sajit
>> //       on February 03, 2019
>>
>>
>> //  Text Files Reader
>> //  Reads document texts from text files specified in a provided list.
>> #   files  The text files to be loaded
>> reader org.apache.ctakes.core.cr.TextReader
>> files=C:\apache-ctakes-4.0.0\testdata\Input\SampleInputRadiologyNotes.txt
>>
>> //  UMLS Dictionary Lookup (Old)
>> //  Annotates clinically-relevant terms.  This is an older, slower dictionary lookup implementation.
>> add org.apache.ctakes.dictionary.lookup.ae.UmlsDictionaryLookupAnnotator
>>
>> //  XMI Writer
>> //  Writes XMI files with full representation of input text and all extracted information.
>> #   OutputDirectory  Output directory to write xmi files
>> add org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes
>> OutputDirectory=C:\apache-ctakes-4.0.0\testdata\output
>>
>> This passes the validation but fails to execute.
>> Please tell me if my approach is right or wrong. Also, is it possible to
>> trim the XMI outputs based on one's needs in the cTakes tool?
>>
>> Any suggestion or help is most welcome. Thanks.
>>
>> Regards,
>> Sajit
>>
>
>
> --
> Regards,
> Gandhi
>
> "The best way to find urself is to lose urself in the service of others
> !!!"
>
>
