That sounds bizarre! I can think of two possibilities: a sentence break in the middle of the word (unlikely), or the different sentence splits confused the POS tagger, which then tagged the word aspirin as a forbidden part of speech, such as a preposition. If you check the token annotation on the word aspirin, you should be able to see the part-of-speech tag for that word.

Tim
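To make the second possibility concrete, here is a toy sketch of a dictionary lookup that skips tokens carrying an excluded part-of-speech tag. Everything in it is hypothetical (the class name, the exclusion list, the one-word dictionary), and it is not cTAKES code; it only assumes that the lookup step filters on POS tags, which is the guess above rather than something confirmed in this thread. The tags are Penn Treebank tags, where IN marks a preposition and NN a singular noun.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy illustration only: none of these names are cTAKES classes or parameters.
class PosFilterSketch {

    // Hypothetical token: surface text plus a Penn Treebank POS tag.
    static final class Token {
        final String text;
        final String pos;

        Token(String text, String pos) {
            this.text = text;
            this.pos = pos;
        }
    }

    // Assumed exclusion list; "IN" is the Penn Treebank tag for prepositions.
    static final Set<String> EXCLUDED_POS = Set.of("IN", "DT", "CC");

    // Stand-in for a medication dictionary.
    static final Set<String> MEDICATIONS = Set.of("aspirin");

    // Returns dictionary hits, skipping tokens whose POS tag is excluded.
    static List<String> findMedications(List<Token> tokens) {
        List<String> hits = new ArrayList<>();
        for (Token token : tokens) {
            if (EXCLUDED_POS.contains(token.pos)) {
                continue; // never looked up, so no mention is created
            }
            if (MEDICATIONS.contains(token.text.toLowerCase(Locale.ROOT))) {
                hits.add(token.text);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Same words, two hypothetical taggings of "aspirin".
        List<Token> taggedAsNoun = List.of(
                new Token("aspirin", "NN"), new Token("his", "PRP$"), new Token("leg", "NN"));
        List<Token> taggedAsPreposition = List.of(
                new Token("aspirin", "IN"), new Token("his", "PRP$"), new Token("leg", "NN"));

        System.out.println(findMedications(taggedAsNoun));
        System.out.println(findMedications(taggedAsPreposition));
    }
}
```

With the noun tagging the sketch prints [aspirin]; with the preposition tagging it prints [], which would look just like a missing MedicationMention. Comparing the POS tag on the aspirin token under both sentence detectors should show whether something like this is happening.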
________________________________________
From: Tomasz Oliwa <ol...@uchicago.edu>
Sent: Tuesday, March 13, 2018 5:34 PM
To: dev@ctakes.apache.org
Subject: Re: Sentence splitter [EXTERNAL]

Hi,

I tested SentenceDetectorAnnotatorBIO in cTAKES 4.0.0, simply by replacing SentenceDetectorAnnotator.xml with SentenceDetectorAnnotatorBIO.xml in AggregatePlaintextFastUMLSProcessor.xml.

While it seemed to work, I noticed that in one example an IdentifiedAnnotation was not found, although it was found for the same input with just SentenceDetectorAnnotator.xml.

Could somebody check this please? Run the cTAKES CVD with the following input (without the quotation marks):

" aspirin his leg "

On the machine I tested this on, the MedicationMention does not show up with SentenceDetectorAnnotatorBIO, but it does with SentenceDetectorAnnotator.

________________________________________
From: Masoud Rouhizadeh <m...@jhu.edu>
Sent: Tuesday, March 13, 2018 3:02:35 PM
To: dev@ctakes.apache.org
Subject: Re: Sentence splitter [EXTERNAL]

Hi Sean,

Thank you for the pointer. I was able to run the SentenceDetectorAnnotatorBIO from ctakes-core. The results are way better than those of the SentenceDetectorAnnotator, but I still see some issues, such as splitting “Dr.” off as a separate sentence (most likely due to the period after the abbreviation).

Do you think there is a way to define an abbreviation list for SentenceDetectorAnnotatorBIO so that it knows that this is a word-final (i.e. abbreviation-final) and not a sentence-final period?

Thanks again,
Masoud

On 3/9/18, 5:35 PM, "Finan, Sean" <sean.fi...@childrens.harvard.edu> wrote:

Hi Masoud,

There is a very nice SentenceDetectorBIO in ctakes-core. It will split sentences based upon features other than just a newline character, which appears to be what you want.

Sean

________________________________________
From: Masoud Rouhizadeh <m...@jhu.edu>
Sent: Friday, March 9, 2018 4:41 PM
To: dev@ctakes.apache.org
Subject: Sentence splitter [EXTERNAL]

Hello cTAKES team!
I was wondering what types of sentence splitters are available in cTAKES. The default sentence splitter does not appear to be the best one. See the output for the demo example from the cTAKES installation guide:

Dr. Nutritious Medical Nutrition Therapy for Hyperlipidemia Referral from: Julie Tester, RD, LD, CNSD Phone contact: (555) 555-1212 Height: 144 cm Current Weight: 45 kg Date of current weight: 02-29-2001 Admit Weight: [...]

Thanks so much,
Masoud

----
Masoud Rouhizadeh, PhD
NLP Specialist / Software Engineer
Institute for Clinical and Translational Research
Johns Hopkins University
https://urldefense.proofpoint.com/v2/url?u=http-3A__pages.jh.edu_-7Emrouhiz1&d=DwIGaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=aZ4yDE4zQbRJuUQ8p-T5nPrjhYvXF28sFoJWEtP3sGU&s=ob0U2sSfS7UijTI8PqCh_MwMucxPc14ovmcC2vq7rDA&e=