Sorry - I missed this:
> I'm using the two CDA files that come with the cTAKES package
> (testpatient_cn_2.xml and testpatient_cn_1.xml compatible with
> NotesIIST_RTF.DTD)
Those files -should- be ok as they were originally used to test the CDA
workflow.
The code for CdaCasInitializer and ClinicalNotePreProcessor hasn't changed
since 2015.
The actual error is coming from the 3rd party xml parser (xerces):
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1;
Content is not allowed in prolog.
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
I am not sure what would be causing this.
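For what it is worth, this message generally means the parser met character data before the first tag, and at line 1, column 1 that would be the very first character of the input. Possible culprits include a byte order mark that reaches the parser as a character, or an input file that is not XML at all. Below is a minimal sketch, not part of cTAKES, that reports how each file in an input directory begins. The class name PrologCheck and its labels are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not a cTAKES class): labels the first bytes of each file.
class PrologCheck {

    // Classifies the leading bytes of a would-be XML document.
    static String classify(byte[] head) {
        if (head.length == 0) return "empty";
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8 byte order mark";
        if (startsWith(head, 0xFE, 0xFF) || startsWith(head, 0xFF, 0xFE)) return "UTF-16 byte order mark";
        if (head[0] == '<') return "starts with markup";
        if (head[0] == ' ' || head[0] == '\t' || head[0] == '\n' || head[0] == '\r') return "leading whitespace";
        return "non-XML content";
    }

    // True if data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java PrologCheck <input directory>");
            return;
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(Paths.get(args[0]))) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) continue;
                byte[] buffer = new byte[4];
                int read;
                try (InputStream in = Files.newInputStream(file)) {
                    read = in.read(buffer);  // a short read is acceptable for this sketch
                }
                byte[] head = Arrays.copyOf(buffer, Math.max(read, 0));
                System.out.println(file.getFileName() + ": " + classify(head));
            }
        }
    }
}
```

Running it against the CDA input directory should show whether anything other than the two test files is being picked up, or whether a file starts with something other than "<".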
I don't run CDA, so I can't speak to the operational status of those components
or the pipeline in general.
Does anybody else out there use CDA?
Sean
________________________________________
From: Finan, Sean <[email protected]>
Sent: Wednesday, December 18, 2019 2:22 PM
To: [email protected]
Subject: Re: cTAKES handling HL7 CDA Level 1 [EXTERNAL] [SUSPICIOUS]
Hi Masoud,
I am not an xml expert, so take this with a grain of salt.
I think that something is wrong/unmatched with the first line of your xml
document.
Make sure that the first line is something like:
<?xml version="1.0" encoding="utf-8"?>
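To see the first-line requirement in isolation, here is a small sketch that uses the JDK's built-in parser rather than cTAKES. It parses one string that begins with the declaration and one that has stray text in front of it. The class name WellFormedCheck and the sample strings are invented for illustration, and the exact error wording may differ between parser versions.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper (not a cTAKES class): checks whether a string parses as XML.
class WellFormedCheck {

    // Returns null if the text parses, otherwise the parser's location and message.
    static String firstParseError(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<?xml version=\"1.0\" encoding=\"utf-8\"?><note>ok</note>";
        String bad = "stray text" + good;  // content in front of the declaration
        System.out.println("good: " + firstParseError(good));
        System.out.println("bad:  " + firstParseError(bad));
    }
}
```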
Sean
________________________________________
From: Masoud Rouhizadeh <[email protected]>
Sent: Wednesday, December 18, 2019 1:47 PM
To: [email protected]
Subject: Re: cTAKES handling HL7 CDA Level 1 [EXTERNAL]
Hi all,
I'm using the cTAKES user installation to process CDA documents with
AggregateCdaProcessor.xml and AggregateCdaUMLSProcessor.xml, located in
/desc/ctakes-clinical-pipeline/desc/analysis_engine/
My script to call this is:
java -Dctakes.umlsuser= -Dctakes.umlspw= -cp
$CTAKES_HOME/lib/*:$CTAKES_HOME/desc/:$CTAKES_HOME/resources/
-Dlog4j.configuration=file:$CTAKES_HOME/config/log4j.xml -Xms2g -Xmx3g
org.apache.ctakes.core.cpe.CmdLineCpeRunner
$CTAKES_HOME/desc/ctakes-clinical-pipeline/desc/collection_processing_engine/test_cda_masoud.xml
test_cda_masoud.xml has a proper path to CDA input and output. I'm using the
two CDA files that come with the cTAKES package (testpatient_cn_2.xml and
testpatient_cn_1.xml compatible with NotesIIST_RTF.DTD).
Unfortunately, it seems that CdaCasInitializer cannot run, and I get the
attached errors. I get the same errors when using the GUI with the
AggregateCdaProcessor AE.
- Am I missing something obvious?
- Does cTAKES *User* installation handle CDA documents?
- Is org.apache.ctakes.core.cpe.CmdLineCpeRunner an appropriate pipeline for
CdaCasInitializer?
Thank you so much for your help in advance.
Masoud
On 11/8/19, 8:30 AM, "Finan, Sean" <[email protected]> wrote:
Hi Masoud,
I think that the CdaCasInitializer is at least 10 years old. I would not
expect it to conform to any recent standards.
Does anybody else have a reader or transformer that can handle HL7 CDA r2?
Sean
p.s.
If anybody is involved with HL7 International, you may want to get some
movement on addressing the typo on the page header(s):
Section 1a: Clinical Document Architcture (CDA®)
________________________________________
From: Masoud Rouhizadeh <[email protected]>
Sent: Thursday, November 7, 2019 5:59 PM
To: [email protected]
Subject: cTAKES handling HL7 CDA Level 1 [EXTERNAL]
Dear cTAKES developer mailing list,
We have been working on a project at Hopkins for converting Epic-generated
RTF notes into Clinical Document Architecture Level One.
We have been using the HL7 CDA® Release 2 Schema, and now we plan to use cTAKES
for concept extraction from those documents. The CDA Schema and examples can be
found here:
https://www.hl7.org/implement/standards/product_brief.cfm?product_id=7
In the cTAKES documentation, I see that CdaCasInitializer "does not handle
all CDA documents. The CDA document must conform to the DTD
resources/cda/NotesIIST_RTF.DTD."
Has anyone tested and evaluated cTAKES' ability to consume HL7 CDA Level 1
Release 2 documents?
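As a rough illustration of what "conform to the DTD" means in practice, the sketch below asks the JDK's validating parser to check a document against the DTD named in its own DOCTYPE. It uses a tiny inline DTD, not NotesIIST_RTF.DTD, and the class name DtdCheck is invented. It is an assumption-laden sketch, not how CdaCasInitializer itself works.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper (not a cTAKES class): collects DTD validation problems.
class DtdCheck {

    // Returns the validation problems found; an empty list means the document is valid.
    static List<String> validate(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);  // validate against the DTD in the DOCTYPE
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) { problems.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            problems.add("fatal: " + e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String dtd = "<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]>";  // made-up inline DTD
        System.out.println("valid:   " + validate(dtd + "<note>hi</note>"));
        System.out.println("invalid: " + validate(dtd + "<other/>"));
    }
}
```

For a real file, one would presumably parse the file itself rather than a string, so that a relative DTD reference in its DOCTYPE can be resolved against the file's location.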
Thank you,
Masoud
----
Masoud Rouhizadeh, PhD
Faculty - Division of Health Science Informatics (DHSI)
NLP Lead - Institute for Clinical and Translational Research (ICTR)
Johns Hopkins University School of Medicine
https://www.cs.jhu.edu/~mrou/