Hi Sean,
I just ran another set of notes through cTAKES and noticed the following
error:
log4j: Setting property [conversionPattern] to [%d{dd MMM yyyy HH:mm:ss} %5p %c{1} - %m%n].
log4j: Adding appender named [consoleAppender] to category [root].
29 Sep 2019 15:31:21 ERROR PiperFileReader - Piper File not found: WindowedAttributeCleartkSubPipe
Is something missing? This is what my DefaultFastPipeline.piper file looks
like (NB: I also tried "load WindowedAttributeCleartkSubPipe.piper", with
similar results):
// Commands and parameters to create a default plaintext document processing pipeline with UMLS lookup
// Load a simple token processing pipeline from another pipeline file
load DefaultTokenizerPipeline.piper
// Add non-core annotators
add ContextDependentTokenizerAnnotator
addDescription POSTagger
// Add Chunkers
load ChunkerSubPipe.piper
// Default fast dictionary lookup
add DefaultJCasTermAnnotator
// Add Cleartk Entity Attribute annotators
// see https://issues.apache.org/jira/browse/CTAKES-449
//load AttributeCleartkSubPipe.piper
load WindowedAttributeCleartkSubPipe
All files seem to have been processed fine, but I'm wondering whether
something was skipped because of the error. If so, how do I construct the
WindowedAttributeCleartkSubPipe.piper file?
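In case it helps narrow this down, here is a minimal sketch of how I could check whether a piper file with that name exists anywhere under my install, either as a loose file or packed inside a jar. The function name find_piper_files and the idea of scanning the install directory are my own assumptions, not anything cTAKES provides; it only looks for ".piper" entries whose file name contains a given fragment.

```python
import zipfile
from pathlib import Path


def find_piper_files(root, name_fragment):
    """Return locations of .piper files whose file name contains name_fragment.

    Hypothetical helper (not part of cTAKES): it searches loose files under
    root and entries inside any .jar archives found there.
    """
    fragment = name_fragment.lower()
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.suffix == ".piper" and fragment in path.name.lower():
            # Loose piper file on disk.
            hits.append(str(path))
        elif path.suffix == ".jar":
            try:
                with zipfile.ZipFile(path) as jar:
                    for entry in jar.namelist():
                        entry_name = entry.rsplit("/", 1)[-1]
                        if entry.endswith(".piper") and fragment in entry_name.lower():
                            # Piper file packaged inside a jar.
                            hits.append(f"{path}!{entry}")
            except zipfile.BadZipFile:
                # Skip files that have a .jar suffix but are not valid archives.
                continue
    return hits
```

Calling find_piper_files("/path/to/ctakes", "AttributeCleartkSubPipe") (the path is a placeholder for the install directory) would list both the regular and the windowed sub-pipe files if they are present. An empty result for the windowed one would suggest that my version simply does not ship it.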
Thanks very much in advance!
Greg--
On Tue, Sep 24, 2019 at 7:27 PM Greg Silverman <[email protected]> wrote:
> Sweet! That was definitely it! It's flying now (granted, our files are not
> in the > 1 MB realm, like in the Jira issue, just in the nnn KB realm, but
> still!).
>
> Mahalo nui loa!
>
>
>
> On Tue, Sep 24, 2019 at 6:29 PM Finan, Sean <
> [email protected]> wrote:
>
>> Hi Greg,
>>
>> Check your log to see what component is taking all the time.
>>
>> There is a known problem with the cleartk assertion annotators:
>>
>> https://issues.apache.org/jira/browse/CTAKES-449
>>
>> A partial fix was made in the "windowed" sub-package of ctakes-assertion:
>> org.apache.ctakes.assertion.medfacts.cleartk.windowed.
>>
>> Each of the normal assertion engines has a replacement in the windowed
>> package.
>>
>> If you are using a piper file that contains "load
>> AttributeCleartkSubPipe" as the Default clinical pipeline does, just
>> replace it with "load WindowedAttributeCleartkSubPipe".
>>
>> It isn't a full fix for the problem, and I don't know if it will make
>> your processing faster, but you can give it a try.
>>
>> Sean
>>
>> ________________________________________
>> From: Greg Silverman <[email protected]>
>> Sent: Tuesday, September 24, 2019 6:47 PM
>> To: [email protected]
>> Subject: Large files taking forever to process [EXTERNAL]
>>
>> Any suggestions on how to speed up processing large clinical text notes
>> approaching 13K lines? This is a very old corpus culled from EPIC notes
>> back in 2009. I thought about splitting the notes into smaller chunks, but
>> then I would have to deal with the offsets when analyzing system output
>> against manual annotations that had been done.
>>
>> As is, I've tried different garbage collection options (this seemed to
>> have worked well with CLAMP on the same set of notes).
>>
>> TIA!
>>
>> Greg--
>>
>> --
>> Greg M. Silverman
>> Senior Systems Developer
>> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
>> Department of Surgery
>> University of Minnesota
>> [email protected]
>>
>> › evaluate-it.org ‹
>>
>
>
--
Greg M. Silverman
Senior Systems Developer
NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
Department of Surgery
University of Minnesota
[email protected]
› evaluate-it.org ‹