Hi Gandhi,

Thanks for your reply.
Let me elaborate on my task. I need to extract the CUIs for the medical concepts in clinical notes, and I believe cTAKES can be a very good tool for this. I went through the default clinical pipeline. It consists of several stages, such as boundary detection, tokenization, and entity recognition, and I believe the output XMI document has a section for each of these stages. However, since I am only interested in the CUIs for medical concepts, I only need the entity recognition results and the entity properties in my output.

Is it possible to create a custom pipeline that produces just this? Or is it possible to turn off the output of the unwanted sections? I hope you understand what I am trying to say. Please advise.

Also, is there any documentation on the structure of cTAKES output files, such as the XMI files?

Looking forward to your response.

Regards,
Sajit

On Sun, 3 Feb, 2019, 20:48 gandhi rajan <[email protected]> wrote:

> Hi Sajit, I would say default clinical pipeline is the best place to start
> for a beginner -
> https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline
>
> Also you got to elaborate what information you are looking for when you
> say many of the information are irrelevant for you.
>
> On Sunday, February 3, 2019, Sajit Kumar <[email protected]> wrote:
>
>> Hi All,
>>
>> I am new to cTakes. I have heard great things about cTakes in processing
>> clinical notes. I have been able to successfully install and launch cTakes
>> applications. However, I have not been able to find enough documentation
>> for the XMI output from these applications such as CPE etc. If anyone can
>> guide me to some documentation to understand the structure of these outputs
>> that would be helpful.
>>
>> Additionally, I am working on a task where i am interested in extracting
>> the UMLS, SNOMED medical concepts from the clinical notes. However, i see
>> that the output usually has lot of information that is not relevant to my
>> task.
>> I tried my hands at creating a custom pipeline to get rid of this
>> information. But it was throwing an exception. Please find below the
>> script.
>>
>> // *** Piper File ***
>> // Created by Sajit
>> // on February 03, 2019
>>
>>
>> // Text Files Reader
>> // Reads document texts from text files specified in a provided list.
>> # files The text files to be loaded
>> reader org.apache.ctakes.core.cr.TextReader
>> files=C:\apache-ctakes-4.0.0\testdata\Input\SampleInputRadiologyNotes.txt
>>
>> // UMLS Dictionary Lookup (Old)
>> // Annotates clinically-relevant terms. This is an older, slower
>> dictionary lookup implementation.
>> add org.apache.ctakes.dictionary.lookup.ae.UmlsDictionaryLookupAnnotator
>>
>> // XMI Writer
>> // Writes XMI files with full representation of input text and all
>> extracted information.
>> # OutputDirectory Output directory to write xmi files
>> add org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes
>> OutputDirectory=C:\apache-ctakes-4.0.0\testdata\output
>>
>> This passes the validation but fails to execute.
>> Please tell me if my approach is right or wrong. And is it possible to
>> trim the XMI outputs based on ones need in the cTakes tool.
>>
>> Any suggestion or help is most welcome. Thanks.
>>
>> Regards,
>> Sajit
>>
>
>
> --
> Regards,
> Gandhi
>
> "The best way to find urself is to lose urself in the service of others
> !!!"
>
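
P.S. To make the goal concrete, this is roughly the post-processing I have in mind if trimming the output inside cTAKES turns out not to be possible. It is only a sketch in Python. It assumes that each concept in the XMI is stored as an element carrying a `cui` attribute, which I have not been able to confirm from any documentation. The function name `extract_cuis` and the sample XML are made up for illustration and are not real cTAKES output.

```python
import xml.etree.ElementTree as ET


def extract_cuis(xmi_text):
    """Return the sorted, de-duplicated CUIs found in an XMI document.

    Assumption: concepts are serialized as elements with a `cui` attribute.
    Elements without that attribute (sentences, tokens, ...) are ignored,
    so the element names and namespaces do not need to be known up front.
    """
    root = ET.fromstring(xmi_text)
    cuis = set()
    for element in root.iter():
        cui = element.get("cui")
        if cui:
            cuis.add(cui)
    return sorted(cuis)


# Hypothetical, hand-written stand-in for an XMI file; not real cTAKES output.
SAMPLE_XMI = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" xmlns:ex="http:///example.ecore">
  <ex:Sentence xmi:id="1" begin="0" end="20"/>
  <ex:Concept xmi:id="2" cui="C0000001" tui="T000"/>
  <ex:Concept xmi:id="3" cui="C0000002" tui="T000"/>
  <ex:Concept xmi:id="4" cui="C0000001" tui="T000"/>
</xmi:XMI>
"""

if __name__ == "__main__":
    print(extract_cuis(SAMPLE_XMI))
```

If the real XMI stores CUIs differently, only the attribute lookup inside the loop should need to change.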
