Hi,

I am trying to batch process the clinical notes from the MIMIC database, using the CPE tool provided with cTAKES. I am only interested in extracting the CUIs from the notes, so I created a CPE file using CuisOnlyUMLSProcessor as the CAS Consumer. The XML output using FileWriterCasConsumer works without a problem.
However, I want the output as XMI instead of the XML produced by the standard templates. For this I tried using the writer XML descriptors provided in the \desc\ctakes-core\desc\cas_consumer directory. I also tried changing the implementation name in these files to org.apache.uima.tools.components.XmiWriterCasConsumer. In both cases I get exceptions and am not able to generate an XMI output.

I have seen people suggest using the Default Clinical pipeline for batch processing. However, this has two drawbacks. First, I am stuck with the standard processors and end up with a lot of details that I don't need. Second, I would lose the multi-threading option that the CPE configuration file provides, which helps speed up processing. I have close to 200K files to process.

Please help me resolve this issue.

Thank you.

Regards,
Sajit
